AI Agents Are Getting Autonomy. Here's What Could

Gartner says one-third of enterprise apps will include agentic AI by 2028. That's not some distant sci-fi future—that's like, right after the next iPhone generation. We're talking about AI that doesn't just answer questions or generate text, but actually does things: schedules your meetings, executes stock trades, makes purchases. All without waiting for you to click "confirm."

Which sounds incredible until you realize we're handing decision-making power to systems we don't fully understand, can't always explain, and definitely haven't figured out how to secure yet.

IBM's Jeff Crume and Josh Spurgin recently broke down the security and governance challenges around autonomous AI agents, and honestly? The attack surface is wild. Not "we should be slightly concerned" wild. More like "this could go sideways in ways we haven't even imagined yet" wild.

The Attack Menu Is Extensive

Crume rattles off the threat list like he's seen some things. And the scary part is that most of these aren't even new vulnerabilities—they're existing AI risks that autonomous agents amplify.

First up: hijacking. Someone sends commands into your AI agent, takes control of it, and now it's working for them instead of you. The primary method? Prompt injection, which is apparently the number one attack type according to OWASP. "These prompt injections where you insert commands that the organization didn't intend to have happen and it allows me to get your AI to do things that it wasn't supposed to do," Crume explains. "Really hard problem to solve and an agent can amplify that."

Then there's model infection. Yes, AI models can get infected like software. "A lot of people don't realize that models can be infected just like software can be infected," Crume notes. Since most organizations aren't creating models from scratch—they're downloading them or using third-party services—you're inherently trusting something you didn't build. Trust, but verify. Except most people aren't doing the verify part.

Data poisoning is the subtle killer. Someone modifies your training data "in just subtle ways" and the results are devastating down the line. Crume's analogy hits: "It's like a little bit of toxin in the drinking water makes all of us sick." Your model could be fundamentally compromised before it ever goes into production.

Evasion attacks manipulate inputs to confuse the AI. Extraction attacks pull sensitive data out, piece by piece—we've already seen zero-click attacks where a user does literally nothing and data gets exfiltrated via email. And then there's good old denial of service, which Crume compares to rush hour traffic: "There is not enough asphalt for all the cars."

What's wild is that Crume describes all of this as "just an example of some of the attacks." This isn't even the complete list.

The Governance Nightmare

Spurgin frames the governance side with a story about a fictitious recruiting firm using AI to handle job applications. The AI has full autonomy to read resumes, schedule interviews, and send offers. Then it sends out an offer without human approval.

When HR asks why it happened, nobody can explain it. The reasoning is "hidden deep inside this extremely complex model." The AI also ends up favoring candidates from certain schools or backgrounds due to biased training data. Eventually, the company gets sued for discrimination.

The question Spurgin poses is the one that actually matters: who's responsible? The AI agent? The HR team using it? The vendor who sold it?

This isn't a hypothetical edge case. This is what happens when you deploy autonomous systems without thinking through oversight, transparency, and accountability. The story sounds extreme until you realize it's probably already happening somewhere.

The "You Can't Secure What You Can't See" Problem

Crume's first safeguard recommendation is almost embarrassingly basic: you need to know what AI instances are running in your environment. This includes "shadow AI"—unauthorized models that someone downloaded and spun up in a cloud instance without telling anyone.

Once you've discovered all your AI instances, you do AI security posture management: make sure they're following your organization's security policies. If it contains sensitive data, it shouldn't be public-facing. If it is public-facing, maybe require multi-factor authentication. Encrypt the data. The usual checklist stuff, except applied to AI.

Penetration testing matters too. Stand up the model, blast it with prompt injections and other attacks, see how it responds. If it rejects the malicious prompts, great. If not, you know you need additional protections before production.

Crume also advocates for AI-specific firewalls—a layer between users and the AI that examines both incoming prompts and outgoing responses. "Are you asking me to do something that's improper?" the firewall asks. If yes, reject it. On the response side, it can catch extraction attacks: "Oh, why are you leaking a ton of credit card numbers? That's not something we want to be spilling out all over the internet."

Governance Isn't Optional Anymore

Spurgin breaks governance into three pillars: lifecycle governance (approval from idea to production), risk and regulation (compliance with relevant rules), and monitoring and evaluation (making sure the thing actually works correctly in production).

The monitoring piece is critical because AI can drift over time. It might start with clean data and good outputs, then gradually get worse as it learns from biased inputs or edge cases accumulate. You need ongoing evaluation, not just a one-time audit.

He also emphasizes the need for a consolidated dashboard for compliance reporting. Which sounds boring until you're trying to prove to regulators that you're not doing discriminatory hiring or financial fraud via AI.

Security vs. Governance: You Need Both

Here's where it gets interesting. Crume and Spurgin make the case that security and governance can't exist independently. "Governance without security is fragile," Crume says. "You can set rules for fairness and transparency, but if someone can hack the model or poison the data, those rules collapse instantly."

And the flip side: "Security without governance is blind. You can lock down the system and defend it from attacks, but if the AI itself is biased, lacks oversight, or just can't be explained, you've just protected something that's already broken."

That framing actually clarifies something I've been trying to articulate about the AI safety conversation. A lot of enterprise security folks focus purely on the technical attack surface—prompt injection, data poisoning, model extraction. Meanwhile, governance people worry about bias, accountability, and explainability. But these aren't separate problems. They're the same problem viewed from different angles.

You can have the most secure AI in the world, completely locked down against external attacks, and it can still make discriminatory decisions or operate in ways nobody can explain. Conversely, you can have beautifully governed AI with perfect transparency and human oversight, and someone can still hijack it via prompt injection on day one.

The question isn't whether autonomous AI agents will become widespread—Gartner's prediction suggests that ship has sailed. The question is whether organizations will implement both security and governance before something goes catastrophically wrong, or after.

Because right now, based on the number of companies that don't even know what AI instances are running in their environment, I'm not loving our odds.

—Tyler Nakamura, Consumer Tech & Gadgets Correspondent