The Hidden Danger in Jack Dorsey's AI Management

Jack Dorsey dropped a blueprint last week for replacing middle managers with AI—what he calls a "world model"—and it got 5 million views in two days. The pitch is clean: instead of managers spending half their time synthesizing status and carrying context between teams, software maintains a living picture of everything happening across the company. Everyone queries it directly. Nobody waits for Monday meetings.

The idea has legs. AI strategy consultant Nate Jones breaks down why enterprise vendors are already rebranding around the concept, why agency founders are posting their implementations, why this feels like the next obvious step in organizational software. A huge share of what fills manager calendars—status syncs, alignment meetings, information shuttling—is work that software can do faster and cheaper today.

But Jones identifies something most of the hype is missing: "World model" is one phrase covering three completely different architectures that fail in different ways. And all three share the same blind spot.

The Invisible Failure Mode

When unconventional management experiments fail, everyone knows. Zappos adopted holacracy and satisfaction scores collapsed. Valve's hidden power structure became a documented case study. Medium's head of operations wrote publicly that the system was getting in the way.

World model failures are different. They're quiet.

"It's going to look like a system flagging a revenue dip as significant when in fact it was seasonal, and it drove a prioritization change it shouldn't have," Jones explains. "And the person who said, 'Ignore that. It happens every year'—maybe they were removed in the last reorg and now nobody catches it because the system presented the finding with a kind of calm, structured confidence."

Or the system surfaces a correlation between a feature launch and a churn spike. The product team kills the feature. But the actual cause was a billing change that shipped the same week. The system couldn't tell correlation from causation, and nothing in the interface signaled that distinction.

The mechanism behind these failures: When you remove a management layer and replace it with nothing, the absence is obvious. When you replace it with a world model, information keeps flowing. Status gets synthesized. Dependencies get flagged. From a dashboard perspective, it looks like the routing function has been successfully automated.

The problem is that managers don't just route information. They edit it. They decide what matters.

Three Architectures, One Boundary Problem

Jones identifies three distinct approaches companies are taking, each with its own failure pattern.

The vector database approach fails by never drawing the line between surfacing and interpreting. Wire up your data sources, embed everything, let agents retrieve by semantic similarity. It's fast to deploy and adequate for pure information logistics. But semantic retrieval has no structural mechanism to distinguish "here's what happened" from "here's what matters." When the system returns results ranked by relevance, that ranking is an interpretation: a claim about what's important. At small scale, senior people have enough context to override bad rankings. At large scale, with hundreds of people consuming system output as their primary information source, the ranking becomes reality. What the system surfaces, people act on. What it doesn't surface, people never see.
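To make the ranking point concrete, here is a minimal sketch of similarity-based retrieval. Everything in it is an assumption made for illustration: the three-dimensional vectors are invented stand-ins for real embeddings, the document ids are hypothetical, and no actual vector database works on data this small. It only shows that a top-k cutoff silently decides what readers never see.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, docs, top_k):
    """Rank docs by similarity to the query; return (surfaced, hidden) ids."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:top_k]], [d["id"] for d in ranked[top_k:]]

# Hypothetical documents with made-up embedding vectors.
DOCS = [
    {"id": "q3-revenue-dip", "vec": [0.9, 0.1, 0.0]},
    {"id": "churn-report", "vec": [0.7, 0.3, 0.1]},
    {"id": "seasonality-note", "vec": [0.2, 0.1, 0.9]},
]

# A query about revenue surfaces the dip and hides the note explaining it.
surfaced, hidden = retrieve([1.0, 0.0, 0.0], DOCS, top_k=2)
print("surfaced:", surfaced)
print("hidden:", hidden)
```

In this toy data the note that would explain the dip is the least similar document, so it falls below the cutoff. Nothing in the returned list signals that anything was left out.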

The structured ontology approach—think Palantir—fails by drawing the line too conservatively. You define the objects, relationships, and actions of your business explicitly. The AI reasons within that bounded structure. A customer is an entity with specific properties. The system cannot hallucinate relationships that don't exist in the schema. But the ontology can only represent what you've already categorized. "It handles known relationships really precisely, is blind to emergent relationships," Jones notes. "The unnamed pattern that once someone sees it reframes how you understand the business." The system is accurate about what it knows, silent about what it doesn't know. And what it doesn't know might matter most.
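A rough sketch of that tradeoff follows. It is not how Palantir or any real ontology product is built: the schema, entity names, and the `assert_relation` helper are hypothetical, chosen only to show a bounded structure accepting known relationships and dropping an unnamed one.

```python
from dataclasses import dataclass, field

# Hypothetical schema: the only relationship types the business has named.
ALLOWED_RELATIONS = {
    ("Customer", "PLACED", "Order"),
    ("Order", "CONTAINS", "Product"),
}

@dataclass
class Ontology:
    facts: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def assert_relation(self, subj_type, relation, obj_type, subj, obj):
        """Store the relation only if its type triple exists in the schema."""
        if (subj_type, relation, obj_type) in ALLOWED_RELATIONS:
            self.facts.append((subj, relation, obj))
            return True
        # Kept here only to make the blind spot visible in this sketch.
        self.rejected.append((subj, relation, obj))
        return False

onto = Ontology()
known = onto.assert_relation("Customer", "PLACED", "Order", "acme", "order-17")
emergent = onto.assert_relation(
    "Customer", "CHURNS_AFTER", "BillingChange", "acme", "billing-v2"
)
print(known, emergent)
```

The sketch keeps a `rejected` list so the gap can be inspected. A system without one would simply have no record that the pattern was ever proposed.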

The signal fidelity approach, Dorsey's bet with Block, fails by assuming the signal interprets itself. Build the world model around the highest-fidelity data your business generates. Transactions are facts. Money is honest. The model improves as a byproduct of doing business. But the connections between facts, such as why a merchant's cash flow is tightening, still require judgment. "Because the underlying signal is clean, the system's interpretive moves probably look more trustworthy than they should be," Jones argues. High signal fidelity at the input layer creates an illusion of high judgment quality at the output layer.

The Interpretive Boundary Nobody's Drawing

The core problem: when a system prioritizes, highlights, suppresses, or escalates, it's making judgments. It's deciding which anomalies to surface, which information reaches which teams, and what counts as important according to its relevance model.

Every one of those decisions used to be made by a human who could factor in things the system can't—organizational politics, the CEO's real priorities versus stated ones, the difference between a structural problem and a seasonal blip, the context that turns noise into signal.

"The output can look similar on the surface, but the quality of those decisions embedded in that output is really fundamentally different," Jones says. "And the organization won't feel that difference right away. It'll just feel this sort of slow degradation of decision quality that it might attribute to bad luck."

The fix isn't avoiding world models. Jones offers five principles for implementations that actually work: Signal fidelity determines your ceiling. Structure should be earned, not imposed. The model compounds only when it encodes outcomes—what happened, what was done about it, what resulted. Design for resistance, because the system only works if the team feeds it. And start now, because the moat is time, not architecture.

But the critical move is labeling your outputs. Distinguish "act on this"—factual, verified, low-risk information like status rollups and dependency flags—from "interpret this first"—judgment calls the system isn't equipped to make reliably. Trends that might be significant or might be noise. Correlations that might be causal or coincidental.
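One way to picture that labeling step is the sketch below. The `Finding` type, the `kind` values, and the rule itself are assumptions made for illustration, not a scheme Jones or any vendor describes. A real system would need a far more careful policy for what counts as verified.

```python
from dataclasses import dataclass

ACT = "act on this"
INTERPRET = "interpret this first"

# Hypothetical finding kinds treated as factual and low-risk.
FACTUAL_KINDS = {"status_rollup", "dependency_flag"}

@dataclass(frozen=True)
class Finding:
    kind: str       # e.g. "status_rollup", "trend", "correlation"
    summary: str
    verified: bool  # True only if checked against a source of record

def label(finding):
    """Default to 'interpret this first' unless factual and verified."""
    if finding.kind in FACTUAL_KINDS and finding.verified:
        return ACT
    return INTERPRET

def render(finding):
    """Put the label in front of the text so it cannot be overlooked."""
    return f"[{label(finding)}] {finding.summary}"

rollup = Finding("status_rollup", "Checkout migration is 80% complete.", True)
dip = Finding("trend", "Revenue fell 6% week over week.", True)
link = Finding("correlation", "Churn rose after the feature launch.", False)

for item in (rollup, dip, link):
    print(render(item))
```

The design choice worth noting is the default. Anything that is not both a known factual kind and verified is routed to a human, so trends and correlations never arrive looking like settled facts.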

"That boundary is never going to be perfect, but you must try to draw it," Jones insists. "The difference between a world model that helps your org and one that slowly degrades it is whether the system communicates uncertainty and demands interpretation correctly."

Right now, almost every implementation actively hides that distinction. The output looks clean. The dashboards look authoritative. Nothing in the interface says: this is where the system's making a judgment call it might be getting wrong.

That's not a database choice failure or an embedding model failure. It's a fundamental failure in how the system presents information—treating high confidence and low confidence, facts and interpretations, routine and novel information all with the same salience, the same apparent certainty.

The companies that get this right will move faster than competitors still routing information through management layers. The ones that get it wrong will make thousands of small bad decisions, each one looking perfectly justified by the data, none of them obviously wrong until the cumulative effect becomes impossible to ignore.

Dev Kapoor covers open source software and developer communities for Buzzrag.