
IBM's Take on AI Agents: Less Skynet, More Assembly Line

IBM's Grant Miller argues against 'super agents' in favor of specialized AI systems. It's the principle of least privilege, repackaged for the AI era.

Written by AI. Mike Sullivan

March 8, 2026


Photo: IBM Technology / YouTube

Here we go again. Another video promising to explain how AI agents "should" behave, as if we've collectively agreed on what AI agents are, let alone how they ought to operate. Grant Miller from IBM Technology has some thoughts, and honestly? They're less revolutionary than the framing suggests—but that doesn't make them wrong.

Miller's core argument is straightforward: stop building super agents. Instead, build specialized agents that collaborate. Think less HAL 9000, more road construction crew. Different roles, limited permissions, clear responsibilities. If this sounds familiar, it's because you've heard it before—just not about AI.

The Principle of Least Privilege Gets a Makeover

What Miller is describing—without quite saying it—is the principle of least privilege, a cornerstone of information security since the 1970s. Give each component of your system the minimum access it needs to do its job, nothing more. It's not sexy. It's not new. But it works, which is why it's stuck around for half a century.
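In code, least privilege amounts to an allowlist checked before any tool call. A minimal sketch, assuming a simple string-based tool registry (the `Agent` class and tool names like `wiki.read` are illustrative, not from IBM's materials):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A specialized agent that holds only the tools its role requires."""
    name: str
    allowed_tools: frozenset

    def invoke(self, tool: str) -> str:
        # Least privilege: anything outside the grant is denied,
        # no matter what the underlying model asks for.
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool!r}")
        return f"{self.name} ran {tool}"


# A wiki-reader agent gets read access to the wiki and nothing else.
wiki_reader = Agent("wiki_reader", frozenset({"wiki.read"}))
```

The point of the frozen dataclass is that the grant is fixed at construction time: widening an agent's access is a deliberate act of creating a new agent, not a runtime mutation.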

"We really want to avoid both the ability to do too much and have too many privileges," Miller explains. In software engineering terms, he's talking about high cohesion: a close alignment between what an agent does and what it's allowed to access.

The question isn't whether this is sound advice—it is. The question is whether organizations will actually follow it when faced with the appeal of a system that can "do everything." History suggests they won't, at least not initially. We've watched this movie before with database permissions, API keys, and cloud IAM roles. Developers start with broad access because it's easier, then spend years trying to claw it back.

The Risk/Capability Matrix

Miller's most useful contribution is his 2x2 matrix categorizing agents by risk and capability. Low-risk, low-capability agents might pull data from an internal wiki—minimal reasoning required, minimal damage if something goes wrong. High-risk, high-capability agents might initiate payments in an accounts payable system—lots of reasoning required, significant potential for damage.

"If we think of low capability, high-risk, this is something like a finance data extractor," Miller notes. "It has read-only access to sensitive information, finance information, but it extracts it, brings it back, and summarizes it."

This framework helps, but it also reveals a tension Miller doesn't fully address: who decides where an agent falls in this matrix? What looks low-risk to a developer might look high-risk to an auditor. What seems like low capability today might become high capability as the system evolves. These categories aren't static, and they're not objective.
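For what it's worth, the matrix itself is trivial to encode; the hard part is the classification inputs. A sketch that maps the quadrants to control regimes — the labels are one reading of the video's recommendations, not IBM's official taxonomy:

```python
def controls_for(high_risk: bool, high_capability: bool) -> str:
    """Map an agent onto the 2x2 risk/capability matrix.

    Returns an illustrative control regime for that quadrant.
    """
    if high_risk and high_capability:
        # e.g. an accounts payable agent that initiates payments
        return "ephemeral, dynamic access, human approval"
    if high_risk:
        # e.g. Miller's finance data extractor
        return "read-only access to sensitive data"
    if high_capability:
        return "broad tooling over non-sensitive data"
    # e.g. an internal wiki lookup agent
    return "static allowlist"
```

Note that the booleans are exactly where the tension lands: whoever sets `high_risk` for an agent is making the judgment call the framework quietly assumes is settled.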

Ephemeral Agents and Dynamic Access

For high-capability agents, Miller advocates two approaches: make them ephemeral (they spin up, complete their task, then disappear), and give them dynamic access (permissions evaluated based on context, not predetermined).

This is where things get interesting—and complicated. Dynamic access control means evaluating permissions at runtime based on what the agent is trying to accomplish. That requires infrastructure that most organizations don't have, and it introduces latency that most applications can't tolerate. Miller's describing an ideal state, not a practical implementation guide.

The ephemeral agent concept is cleaner but raises different questions. If your agent disappears after each task, how do you audit what it did? How do you debug when something goes wrong? How do you ensure consistency across multiple invocations? These aren't insurmountable problems, but they're not trivial either.
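One answer to the audit question is to write the log before the agent acts, so the record outlives the agent. A minimal sketch of both ideas together — ephemeral identity plus a context-evaluated grant — with an illustrative policy (the `payments.initiate` action and the $10,000 cap are invented for the example):

```python
import contextlib
import datetime
import uuid

audit_log: list = []  # durable record that outlives each agent


def grant(action: str, context: dict) -> bool:
    """Dynamic access: evaluate the permission at runtime from context,
    rather than reading it off a predetermined role."""
    if action == "payments.initiate":  # illustrative policy
        return context.get("amount", 0) <= 10_000
    return True


@contextlib.contextmanager
def ephemeral_agent(role: str):
    """Spin up an agent for one task; its identity disappears afterwards."""
    agent_id = f"{role}-{uuid.uuid4().hex[:8]}"

    def act(action: str, **context):
        allowed = grant(action, context)
        # Audit before acting, so even a denied call leaves a trace.
        audit_log.append({
            "agent": agent_id, "action": action, "allowed": allowed,
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        })
        if not allowed:
            raise PermissionError(action)
        return "done"

    try:
        yield act
    finally:
        pass  # credentials would be revoked here; only the log survives


with ephemeral_agent("ap-clerk") as act:
    act("payments.initiate", amount=500)
```

Debugging across invocations then becomes a query over the log keyed by role rather than by agent identity — workable, but a long way from the tooling most teams have today.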

The Human in the Loop

Miller's suggestion for high-risk, high-capability agents is to insert a human approval step. Before the accounts payable agent actually initiates that payment, it asks a human: "This action is about to happen. Do you approve that?"
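The approval step described above is, structurally, just a gate between decision and execution. A sketch, with the approver injected as a callable so the gate could be backed by a console prompt, a ticketing system, or a test double (the function and parameter names are invented for illustration):

```python
def approval_gate(description: str, execute, ask=input):
    """Insert a human between a high-risk decision and its execution.

    `ask` receives the prompt and returns the human's answer;
    anything other than an explicit "y" declines the action.
    """
    answer = ask(f"{description} Do you approve? [y/N] ")
    if answer.strip().lower() != "y":
        return "declined"
    return execute()


# The agent proposes; the human disposes.
result = approval_gate(
    "This action is about to happen: pay $500 to a vendor.",
    execute=lambda: "payment initiated",
    ask=lambda prompt: "y",  # stand-in for a real approver
)
```

Defaulting to "declined" on anything but an explicit yes is the conservative choice for a high-risk gate: silence or ambiguity should never execute the action.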

This is the part where I start to wonder what we're actually building. If you need human approval for the high-stakes decisions—the ones where AI agents would theoretically add the most value—what's the agent really doing? Pattern matching? Data aggregation? Those are useful functions, but they're not the autonomous decision-making that the "agentic AI" hype promises.

There's a sleight of hand happening in much of the AI agent discourse. The valuable capabilities (autonomous decision-making, complex reasoning, adaptive behavior) are the same capabilities that introduce risk. When you mitigate the risk by requiring human oversight, you're also limiting the value. This isn't a criticism of Miller's approach—it's probably the right tradeoff. But it does suggest that the vision of fully autonomous AI agents handling complex business processes is further away than the marketing materials imply.

What's Actually New Here?

Strip away the AI-specific terminology, and Miller is describing microservices architecture with runtime access control. That's not a dig—microservices work for good reasons. But it does raise the question: what makes this "agentic" rather than just "well-designed software"?

The difference seems to be that these agents use LLMs or other AI models to make decisions about which tools to use and how to use them, rather than following predetermined logic. That's meaningful, but it also means we're adding a layer of non-determinism to systems that have traditionally valued predictability.

Miller acknowledges this: "These are really more non-deterministic. They're going to, with the reasoning, decide what it is that they need to interact with and the actions that they need to take."

Non-determinism in high-risk systems is a feature that security professionals typically try to eliminate, not introduce. Miller's framework is an attempt to contain that non-determinism, to fence it in with permissions and oversight. Whether that's sufficient depends on how reliably these agents actually reason—and we're still figuring that out.
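The containment Miller describes — fencing non-deterministic tool choice behind a permission layer — can be sketched in a few lines. Here the model's choice is simulated with `random.choice` as a stand-in for an LLM; the tool names and grant are invented for the example:

```python
import random

TOOLS = {
    "wiki.read": lambda q: f"wiki:{q}",
    "payments.initiate": lambda q: f"paid:{q}",
}
ALLOWED = {"wiki.read"}  # this agent's grant


def model_pick_tool(query: str) -> str:
    """Stand-in for an LLM's tool choice: non-deterministic by design."""
    return random.choice(list(TOOLS))


def run(query: str) -> str:
    choice = model_pick_tool(query)
    # The fence: the model proposes, the permission layer disposes.
    # A refused choice is reported, never silently executed.
    if choice not in ALLOWED:
        return f"refused: {choice!r} is outside this agent's grant"
    return TOOLS[choice](query)
```

Whatever the model picks, the set of actions that can actually execute is deterministic — which is the whole bet behind the framework.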

The Hollywood Problem

Miller opens by dismissing the "Hollywood view" of AI agents—the super-agent that can do everything but inevitably goes rogue. It's a useful straw man, but it's also revealing. The reason Hollywood keeps returning to that narrative isn't because screenwriters lack imagination. It's because the super-agent is the logical endpoint of "let's make this more capable and give it more access."

The discipline required to maintain Miller's specialized, limited-access agent architecture goes against every incentive in software development. It's slower to build. It's harder to modify. It requires more coordination. Every developer who's worked on a microservices project knows the temptation to just give the service a little more access, to let it do one more thing, because that's easier than coordinating with another team.

The question isn't whether Miller's approach is technically sound—it is. The question is whether organizations will maintain that discipline as their agent systems grow more complex and the competitive pressure to ship faster intensifies. Based on how we've handled previous generations of distributed systems, I'm not optimistic.

But maybe that's fine. Maybe we'll build super agents, realize they're harder to manage than promised, spend a few years dealing with the security incidents and compliance nightmares, then gradually adopt the kind of bounded, specialized agent architecture Miller describes. We've followed that pattern with enough other technologies that there's no reason to think AI agents will be different.

The advice is solid. The likelihood of anyone following it from the start is low. And that, more than any technical limitation, might be what determines how this plays out.

—Mike Sullivan, Technology Correspondent

Watch the Original Video

4 Ways AI Agents Should Behave for Smarter Systems

IBM Technology

13m 24s
Watch on YouTube

About This Source

IBM Technology

IBM Technology, a YouTube channel launched in late 2025, has swiftly garnered a following of 1.5 million subscribers. The channel serves as an educational platform designed to demystify cutting-edge technological topics such as AI, quantum computing, and cybersecurity. Drawing on IBM's rich history of technological innovation, it aims to provide viewers with the knowledge and skills necessary to succeed in today's tech-driven world.
