Malvag.io

From Chatbots to Agents: The Paradigm Shift Nobody Has Actually Understood


In the last six months, the technical capabilities of LLMs have done backflips that most companies haven’t genuinely processed yet.

We’re not talking about chatbots anymore — those are old news. Honestly, I managed to stuff one into my own site, malvag.io, which runs on a two-euro-a-month machine from Aruba. That’s how trivial it’s become.

What we can actually talk about now are agents: agents that open terminals, navigate browsers, modify files, execute deploys.

We’ve gone from LLMs that answer to LLMs that work — and the difference is substantial. Now that we’ve given them hands (read: access to our operating system consoles), these machines are genuinely starting to automate tasks that are anything but simple.

On top of that, open-weight models are closing the gap with frontier ones fast. DeepSeek, for instance, combines strong performance and low cost with a real on-premise deployment option.

Over the next few months, AI will stop being the exclusive domain of hyperscalers. Anyone building total dependency on OpenAI or Microsoft Copilot right now is replicating exactly the cloud lock-in it took us ten years to recognise as a problem.

And the value, for what it’s worth, is shifting from the model to the orchestration layer.
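To make the lock-in point concrete, here is a minimal sketch of an orchestration layer that depends on an interface rather than on a vendor. Every name in it (ChatModel, EchoModel, Orchestrator) is hypothetical and invented for illustration; a real backend would wrap a hosted API or an on-premise model behind the same method.

```python
from dataclasses import dataclass
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in backend for illustration; it only echoes the prompt back."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


@dataclass
class Orchestrator:
    """Owns the workflow; the model behind it is a swappable detail."""

    model: ChatModel

    def run(self, task: str) -> list[str]:
        # Two fixed steps stand in for a real multi-step workflow.
        plan = self.model.complete(f"Plan: {task}")
        result = self.model.complete(f"Execute: {task}")
        return [plan, result]


if __name__ == "__main__":
    # Swapping the backend is a one-line change; the workflow code is untouched.
    for backend in (EchoModel("hosted"), EchoModel("on-prem")):
        print(Orchestrator(backend).run("deploy the site"))
```

The design choice is the usual one against lock-in: the workflow is the asset you keep, and the model is a dependency you can replace.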

This also means security is becoming genuinely critical. The more models are deployed with flawed security policies, the larger the attack surface grows. Prompt injections are becoming as frequent and deliberate as SQL injections were in the early days of web applications, and they can be just as devastating.

And it makes sense: if an agent reads data, browses content, and can act in the real world, it can also be manipulated by that content. Cybersecurity is no longer separate from AI engineering. They’re the same discipline.

The question every CTO and CIO should be asking right now is whether they’re building security perimeters around their agents — and whether anyone on the team actually knows how to isolate them, orchestrate them, and avoid getting played by the first piece of malicious content that crosses their path.
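What a perimeter around an agent might look like, in its simplest form: a gate between the model and the console that checks every proposed command against an allowlist, and demands human approval for anything that changes state once the agent has read untrusted content. This is a sketch under stated assumptions, not a complete defence. The command sets, the AgentSession class, and the three decision labels are all invented for illustration, and the gate assumes the command is later run as an argument list without a shell (for example subprocess.run(argv)), so operators like && are never interpreted.

```python
import shlex
from dataclasses import dataclass, field

# Hypothetical policy: what the agent may run, and which commands are read-only.
ALLOWED = {"ls", "cat", "git", "deploy"}
READ_ONLY = {"ls", "cat"}


@dataclass
class AgentSession:
    tainted: bool = False
    log: list[str] = field(default_factory=list)

    def ingest(self, source: str, trusted: bool) -> None:
        """Record what the agent read; untrusted content taints the session."""
        self.log.append(f"read {source} (trusted={trusted})")
        if not trusted:
            self.tainted = True

    def authorize(self, command: str) -> str:
        """Return 'allow', 'deny', or 'needs_approval' for a proposed command."""
        try:
            argv = shlex.split(command)
        except ValueError:
            argv = []  # Unparseable input (e.g. unbalanced quotes) is denied.
        if not argv or argv[0] not in ALLOWED:
            decision = "deny"
        elif self.tainted and argv[0] not in READ_ONLY:
            # After reading untrusted content, state-changing commands need a human.
            decision = "needs_approval"
        else:
            decision = "allow"
        self.log.append(f"{decision}: {command}")
        return decision


if __name__ == "__main__":
    session = AgentSession()
    print(session.authorize("deploy prod"))
    session.ingest("https://example.com/page", trusted=False)
    print(session.authorize("deploy prod"))
```

The point of the taint flag is the one made above: an agent that has read content can be steered by that content, so whatever it proposes afterwards deserves less trust than what it proposed before.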

If you don’t know the answer yet, you already have a problem.

Predictions for the Next Six Months

Here’s what I expect:

  1. Someone will announce that an LLM has been added to the org chart — not buried in a footnote, but a real announcement, a prominent role, complete with press releases, fanfare, and the inevitable controversy.

  2. The first truly devastating prompt injection attack will hit some organisation that pushed aggressively into agentic AI without adequately protecting itself. I mean a serious breach — critical systems compromised, hundreds of thousands of customer records exfiltrated.

  3. Companies that have shipped real agentic workflows in production and said so publicly will start pulling visibly ahead of the pack — not just in efficiency, but in decision-making speed.

What do you think? And what are your predictions for the next six months?