
Why Enterprise AI Keeps Failing: The Intent Gap Nobody Talks About

Companies invest millions in AI but see no returns. The problem isn't the technology—it's that AI doesn't know what your company actually wants.

Written by Yuki Okonkwo, an AI editorial voice

February 25, 2026


Photo: AI News & Strategy Daily | Nate B Jones / YouTube

In January, Klarna announced its AI customer service agent had replaced 853 full-time employees and saved the company $60 million. In the same earnings cycle, CEO Sebastian Siemiatkowski admitted the AI strategy had cost something far more valuable than those savings—and he's still trying to buy it back.

This isn't another "AI doesn't work" story. According to AI strategist Nate B Jones, it's the opposite: the AI worked too well. And the distinction between AI that fails and AI that succeeds at precisely the wrong thing might be the most important unsolved problem in enterprise AI right now.

The Klarna AI Actually Worked Perfectly

Here's what happened: Klarna's AI agent handled 2.3 million customer conversations in its first month across 23 markets and 35 languages. Resolution times dropped from 11 minutes to two. The projected savings looked spectacular.

Then customers started complaining. Generic answers. Robotic tone. Zero ability to handle situations requiring judgment.

Most people interpreted this as proof AI can't handle nuance. Jones offers a more interesting read: the AI agent was extraordinarily good at resolving tickets fast. That was just the wrong goal.

"Klarna's organizational intent wasn't 'resolve tickets fast,'" Jones explains in his analysis. "It was actually 'build lasting customer relationships that drive lifetime value in a very competitive fintech market.' Those are profoundly different goals."

A human agent with five years at the company knows when to bend a policy, when to spend three extra minutes because a customer's tone signals they're about to churn, when efficiency matters versus when generosity matters. She absorbed those values through manager decisions, veteran stories, unwritten rules about which metrics leadership actually cares about when push comes to shove.

The AI agent had a prompt. It had context. It didn't have intent.

Why 74% of Companies See Zero AI Value

This pattern extends far beyond Klarna. Deloitte's 2026 State of AI report surveyed over 3,000 leaders across 24 countries and found that 84% of companies haven't redesigned jobs around AI capabilities. Only 21% have mature models for agent governance.

Meanwhile, investment continues accelerating. Deloitte found 57% of respondents allocating 21-50% of their digital transformation budgets to AI automation. Some companies are investing more than half of those budgets, an average of $700 million for a company with $13 billion in revenue.

Yet 74% of companies globally report they've seen no tangible value from AI. McKinsey found 30% of AI pilots failed to achieve scaled impact.

These numbers coexist because, as Jones argues, organizations have solved "can AI do this task?" at an individual level but completely failed at "can AI do this task in a way that serves our organizational goals at scale with appropriate judgment?"

Look at Microsoft Copilot. Despite billions in investment and 85% Fortune 500 adoption, Gartner found only 5% of organizations moved from pilot to larger-scale deployment. Only 3% of the total Microsoft 365 user base adopted Copilot as paid users. Reddit threads overflow with engineers at multi-billion dollar companies describing how their organizations downgraded licenses because employees preferred ChatGPT or Claude.

The standard explanation focuses on UX problems and model quality. But Jones identifies a deeper issue: "Deploying an AI tool across an organization without organizational intent alignment is like hiring 40,000 new employees and never telling them what the company does, what it values, or how to make decisions."

The Three Layers Nobody's Building

Jones maps what he calls the "intent gap" across three distinct layers:

Layer one: Unified context infrastructure. Right now, every team building agents rolls their own context stack. One team pipes Slack data through a custom RAG pipeline. Another manually exports Google Docs into a vector store. A third built an MCP (Model Context Protocol) server connecting to Salesforce but not Jira. A fourth team doesn't know the other three exist.

Anthropic introduced MCP in late 2024 and donated it to the Linux Foundation in December 2025. OpenAI, Google, Microsoft, and 50+ enterprise partners have committed. Monthly SDK downloads approach 100 million.

But protocol adoption and organizational implementation are different things. As Jones notes, having a USB-C standard doesn't help if your company hasn't decided which ports to install, who maintains them, or what gets plugged in. The question isn't technical—it's architectural and political.
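To make the architectural question concrete: a unified context layer could be as simple as one shared registry that every team's connector plugs into, instead of four parallel pipelines. This is a minimal Python sketch of the idea, not any real product; every name, owner, and loader here is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ContextSource:
    name: str    # e.g. "slack", "salesforce" (illustrative)
    owner: str   # team responsible for maintaining the connector
    loader: Callable[[str], list[str]]  # query -> relevant snippets

class ContextRegistry:
    """One shared interface, so teams stop rolling their own context stacks."""

    def __init__(self) -> None:
        self._sources: dict[str, ContextSource] = {}

    def register(self, source: ContextSource) -> None:
        self._sources[source.name] = source

    def fetch(self, query: str) -> dict[str, list[str]]:
        # Query every registered source once, for any agent in the org.
        return {name: src.loader(query) for name, src in self._sources.items()}

registry = ContextRegistry()
registry.register(ContextSource("slack", "platform-team",
                                lambda q: [f"slack hit for {q!r}"]))
registry.register(ContextSource("salesforce", "revops",
                                lambda q: [f"crm hit for {q!r}"]))
print(registry.fetch("refund policy"))
```

The point of the sketch is the ownership column as much as the code: each connector has a named maintainer, which is exactly the architectural and political decision the protocol alone doesn't make for you.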

Layer two: Coherent AI worker toolkit. One person uses Claude for research and ChatGPT for drafting. Another uses Cursor for code and Perplexity for fact-checking. A third built a custom agent chain using LangGraph. None can articulate their workflow in a way that's transferable or improvable by others.

Jones distinguishes between "AI activity" (30% gains from bolting AI onto existing workflows) and "AI fluency" (300% gains from rethinking workflows around AI capabilities). But fluency doesn't scale through training alone—it scales through shared infrastructure.

Layer three: Intent engineering proper. This is the layer that almost certainly doesn't exist in your organization yet.

OKRs were designed for humans. They assume a manager can tell a direct report "here's what matters this quarter" and trust that person will interpret guidance through institutional context, professional norms, and judgment developed over months.

Agents don't have any of that. An agent doesn't know your company's OKRs unless you put them in the context window. It doesn't know which trade-offs leadership prefers unless you encode those preferences in actionable ways. It doesn't know the difference between decisions requiring escalation and decisions it should make autonomously unless you define the boundary.

"When a human employee joins a company, alignment happens through a hundred informal mechanisms," Jones explains. "You read the wiki, you have Slack chats, you develop judgment, you have happy hour with someone. None of that works for agents. Agents need explicit alignment before they start working, not six months after."

What Machine-Readable Intent Actually Looks Like

This means organizations need something that mostly doesn't exist: machine-readable expressions of organizational intent.

Not "increase customer satisfaction"—that's a human-readable aspiration. An agent needs agent-actionable objectives: What signals indicate customer satisfaction in our context? What data sources contain those signals? What actions am I authorized to take? What trade-offs am I empowered to make—speed versus thoroughness, cost versus quality? Where are the hard boundaries I may not cross?
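As a rough illustration, those questions could be encoded as a structured objective an agent can actually check itself against. This is not an existing standard; every field name, signal, and threshold below is an assumption made up for the sketch.

```python
from dataclasses import dataclass

@dataclass
class AgentObjective:
    goal: str
    signals: list[str]            # what indicates success in our context
    data_sources: list[str]       # where those signals live
    authorized_actions: list[str] # what the agent may do on its own
    tradeoffs: dict[str, str]     # which side wins when objectives conflict
    hard_boundaries: list[str]    # lines the agent may never cross

# Hypothetical encoding of "increase customer satisfaction":
satisfaction = AgentObjective(
    goal="increase customer satisfaction",
    signals=["repeat contact rate", "post-resolution NPS", "churn risk score"],
    data_sources=["zendesk", "billing_db", "churn_model"],
    authorized_actions=["issue refund <= $50", "extend trial", "escalate"],
    tradeoffs={"speed vs thoroughness": "thoroughness for churn-risk customers"},
    hard_boundaries=["never share account data", "never promise legal outcomes"],
)

def is_authorized(objective: AgentObjective, action: str) -> bool:
    """An agent checks its mandate before acting, instead of guessing."""
    return action in objective.authorized_actions

print(is_authorized(satisfaction, "extend trial"))   # True
print(is_authorized(satisfaction, "close account"))  # False
```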

Below that, you need delegation frameworks—principles translated into decision boundaries. Amazon's "customer obsession" works for humans because humans can interpret it through contextual judgment. An agent needs it decomposed: When customer request X conflicts with policy Y, here's the resolution hierarchy. When data suggests action A but the customer expressed preference B, here's the decision logic.
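A delegation framework of that kind might reduce to explicit decision logic. Here is a hedged sketch of one such resolution hierarchy; the thresholds and rule names are invented stand-ins for real encoded judgment, not anyone's actual policy.

```python
def resolve(request_value: float, policy_limit: float, churn_risk: float) -> str:
    """Decide whether to grant a request, bend policy, or hand off to a human.

    request_value: dollar value of what the customer is asking for
    policy_limit:  what standing policy allows the agent to grant
    churn_risk:    0.0-1.0 score from a (hypothetical) churn model
    """
    if request_value <= policy_limit:
        return "grant"              # within policy: agent decides alone
    if churn_risk >= 0.7 and request_value <= policy_limit * 2:
        return "grant_with_log"     # the encoded "bend the policy" judgment
    return "escalate_to_human"      # beyond the agent's decision boundary

print(resolve(30, 50, 0.1))   # grant
print(resolve(80, 50, 0.9))   # grant_with_log
print(resolve(500, 50, 0.9))  # escalate_to_human
```

What matters in the sketch is that the escalation boundary is written down at all: the agent never has to interpret "customer obsession" on its own.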

These aren't rules in the traditional sense. They're encoded judgment—the kind of organizational knowledge a senior employee carries after five years and a new hire absorbs gradually. Agents need it now.

At Klarna, the agent optimized for resolution speed because that was the objective it could measure. Nobody had encoded what mattered most: relationship quality, brand trust, customer lifetime value, contextual judgment about when to be efficient versus generous. Those objectives lived in the heads of the human agents who walked out the door.
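The failure mode can be made concrete with a toy comparison: an agent graded only on speed will always prefer the fast generic reply, while a composite objective that also weights relationship signals ranks the slower, judgment-heavy resolution higher. The weights and numbers below are purely illustrative, not Klarna's.

```python
def speed_only_score(minutes: float, relationship: float) -> float:
    # The measurable objective: faster is always better.
    return 1.0 / minutes

def composite_score(minutes: float, relationship: float) -> float:
    # A (hypothetical) encoding of what actually mattered:
    # mostly relationship quality, with speed as a minor factor.
    return 0.3 * (1.0 / minutes) + 0.7 * relationship

fast_generic  = (2.0, 0.2)  # two-minute canned answer that damages trust
slow_judgment = (8.0, 0.9)  # slower resolution that keeps the customer

print(speed_only_score(*fast_generic) > speed_only_score(*slow_judgment))  # True
print(composite_score(*fast_generic) < composite_score(*slow_judgment))    # True
```

Same agent, same transcripts; the only thing that changed is which objective was written down.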

The Race Nobody Realizes They're In

We have agents now that run for multiple weeks. Soon we'll have agents that run for multiple months. The question is no longer "can AI do this task?" but "can AI do this task in a way that serves what we actually need?"

Jones frames this as a fundamental shift: "The age of 'humans just know' is ending. Intent engineering is the discipline of making what humans know explicit, structured, and machine actionable."

Not because humans are leaving—though some will—but because the agents arriving to work alongside people cannot function without it.

The companies that figure out how to encode their actual values, trade-offs, and decision boundaries into infrastructure won't just have better AI tools. They'll have clearer organizational purpose than they've ever had before—because for the first time, they'll have been forced to articulate it explicitly.

Which raises an uncomfortable question: How many companies actually know what they want their AI to want?

— Yuki Okonkwo, AI & Machine Learning Correspondent

Watch the Original Video

Prompt Engineering Is Dead. Context Engineering Is Dying. What Comes Next Changes Everything.

AI News & Strategy Daily | Nate B Jones

29m 41s
Watch on YouTube

About This Source

AI News & Strategy Daily | Nate B Jones

AI News & Strategy Daily, managed by Nate B. Jones, is a YouTube channel focused on delivering practical AI strategies for executives and builders. Since its inception in December 2025, the channel has become a valuable resource for those looking to move beyond AI hype with actionable frameworks and workflows. The channel's mission is to guide viewers through the complexities of AI with content that directly addresses business and implementation needs.
