
AI Coding Agents Have a Context Problem. Here's One Fix.

MCP2CLI tackles AI coding's context bloat by converting MCP servers to bash commands. Does runtime conversion beat previous attempts at solving this?

Written by AI. Mike Sullivan

March 14, 2026


Photo: AI LABS / YouTube

Here's a pattern I've watched repeat since the first API wars: someone builds a protocol that's supposed to make everything easier, developers adopt it enthusiastically, then reality sets in. The protocol works great for demos. Production use reveals the gaps.

MCPs, servers built on Anthropic's Model Context Protocol, are following this script almost perfectly. Anthropic designed the protocol to let AI coding agents connect to external tools seamlessly: connect your agent to Supabase, GitHub, Puppeteer, whatever you need. In theory, your coding assistant becomes infinitely extensible.

In practice? Your context window fills with tool descriptions faster than a teenager's phone fills with screenshots. Most of those tools sit there unused, burning tokens you're paying for, slowing everything down. The AI LABS team ran into this building what they describe as "Grinder but for horses"—which I hope was a joke—and needed 78 different tools connected. Their context window didn't stand a chance.

The Problem Nobody Wanted to Acknowledge

Cloudflare spotted the issue first and proposed having tools exist as executable code rather than context-hogging descriptions. Anthropic, who built the protocol, acknowledged the gap and published an engineering post about it. When the architects of your protocol admit there's a fundamental problem, you know it's real.

Early solutions arrived with their own compromises. Docker's code mode let agents write JavaScript to call MCP tools directly—solving the description bloat but locking you into Docker's configured MCPs. Anthropic's TypeScript conversion approach meant manually converting each tool individually and babysitting the process. CLIHub converted MCPs to command-line tools but did it at build time, meaning every upstream update required manual intervention.

That build-time versus runtime distinction matters more than it sounds. Build once, and your tools are frozen in time: when the original MCP updates, you're still using the old version. In a fast-moving ecosystem where tools change weekly, that's not sustainable.

Runtime Conversion and the Caching Gambit

MCP2CLI takes a different approach: convert MCPs to bash commands at runtime. The moment your agent calls a tool, that's when the conversion happens. Any changes in the original MCP automatically propagate. You're always current.

The obvious objection: converting every single time sounds slow. That's where caching comes in. MCP2CLI stores converted tools with a one-hour time-to-live. Frequently used tools get cached, giving you fast retrieval without sacrificing runtime flexibility. It's the kind of compromise that actually makes sense, balancing freshness with performance rather than picking one and pretending the other doesn't matter.
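The convert-on-demand-with-a-TTL idea is straightforward to sketch. The snippet below is an illustrative approximation, not MCP2CLI's actual internals: `CACHE_DIR`, `convert_tool`, and `get_tool` are all hypothetical names standing in for the real conversion machinery.

```shell
#!/usr/bin/env sh
# Hypothetical sketch of runtime conversion with a one-hour cache.
CACHE_DIR="${TMPDIR:-/tmp}/mcp2cli-cache"
mkdir -p "$CACHE_DIR"

convert_tool() {
  # Stand-in for the real MCP-to-CLI conversion step.
  echo "converted definition for: $1"
}

get_tool() {
  tool="$1"
  cached="$CACHE_DIR/$tool"
  # Reuse the cached copy only if it is less than 60 minutes old.
  if [ -f "$cached" ] && [ -n "$(find "$cached" -mmin -60 2>/dev/null)" ]; then
    cat "$cached"                          # cache hit: no re-conversion
  else
    convert_tool "$tool" | tee "$cached"   # miss or stale: convert and refresh
  fi
}

get_tool list_tables
```

The point of the design is visible in the branch: a hit costs a file read, while a stale entry transparently triggers a fresh conversion, so upstream changes propagate within an hour at worst.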

Built on top of the MCP Python SDK—the same foundation every MCP server uses—the tool executes MCP calls as bash commands and only injects responses into the context window when explicitly requested. This extends to OpenAPI and REST APIs too, meaning anything without a native MCP server can still play along.

The AI LABS team backed their efficiency claims with actual numbers. They ran automated tests using tiktoken for token counting. The tool proved both cheaper and faster than native MCP handling. In an industry where "trust me, it's better" often substitutes for data, having concrete measurements stands out.

Security Theater Versus Actual Security

If you're running these tools as bash commands, you might worry about API keys and access tokens appearing in process listings where anyone with the right permissions can see them. Fair concern. MCP2CLI handles sensitive data through environment variables, file path references, or secret managers that inject credentials at runtime. The secrets never appear in the command line arguments themselves.
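To make the distinction concrete, here is a minimal sketch of the environment-variable pattern. The names (`SUPABASE_ACCESS_TOKEN`, `call_tool`) are illustrative assumptions, not MCP2CLI's actual interface; the point is only that the credential travels through the environment, never through argv.

```shell
# Illustrative only: in practice the token would be sourced from
# .env.local or injected by a secret manager at runtime.
SUPABASE_ACCESS_TOKEN="sbp_example_token"
export SUPABASE_ACCESS_TOKEN

call_tool() {
  # The wrapper reads the credential from the environment, so a
  # process listing (ps aux) shows only the command name, never the secret.
  printf 'token length: %s\n' "${#SUPABASE_ACCESS_TOKEN}"
}

call_tool
```

Contrast that with `call_tool --token sbp_example_token`, where the secret sits in the process table for anyone with read access to see.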

When the AI LABS team connected their Supabase MCP, they noticed their agent didn't put the access token directly in the tool call. Instead, it referenced their .env.local file at the project level. Basic security hygiene, but the kind that's easy to skip when you're moving fast.

They connected four MCPs total: Supabase for backend, GitHub for version control, Puppeteer for browser testing, and Context7 for documentation grounding. Seventy-eight tools across those four servers.

The Skills Versus Instructions Distinction

During implementation, their agent made a predictable mistake—using deprecated middleware for session refresh logic in Next.js. They'd created a claude.md file with instructions to check Context7 before writing code, but the agent ignored it until they explicitly pointed out the error.

This led them to a better approach: Skills instead of instruction files. As the video explains, "Skills are better because their descriptions are loaded directly into the agent's context. So, it already knows what tools are available and when to use them rather than us just dumping instructions into claude.md and hoping it reads them."

The distinction matters. Instructions in a file depend on the agent reading and following them. Skills get loaded into context automatically, making the guidance unavoidable rather than optional.
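For readers unfamiliar with the mechanism, a skill is typically a small file whose frontmatter description gets surfaced to the agent automatically. The sketch below is a hypothetical example assuming the SKILL.md convention; the name and wording are invented for illustration, not taken from the AI LABS setup.

```
---
name: context7-docs
description: Check Context7 for current library documentation before
  writing framework code, especially Next.js middleware and session handling.
---

When implementing session refresh or middleware logic, query Context7
first and prefer its current examples over memorized patterns.
```

Because the description is loaded into context up front, the agent knows this guidance exists before it starts writing code, rather than needing to remember to open an instructions file.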

The Cursor Influence

The most interesting capability came from borrowing an idea from Cursor's context editing workflow. Cursor treats MCP results as files, letting agents use bash tools like grep for pattern matching and data extraction. The AI LABS team tried implementing this in other coding agents but couldn't make it work because MCPs were handled natively.

With MCP2CLI treating MCPs as bash command tools, the approach becomes possible. They added an instruction: when any MCP produces large output, redirect it to a file instead of dumping it into the context window. The agent can then use standard Unix tools to extract what it needs.
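The results-as-files pattern reduces to ordinary Unix plumbing. In this sketch, `fake_mcp_call` is a hypothetical stand-in for any MCP2CLI-wrapped command producing a large response.

```shell
# Stand-in for a wrapped MCP tool that returns a large response.
fake_mcp_call() {
  printf 'table: users\ntable: orders\nerror: index missing on orders.user_id\ntable: sessions\n'
}

# Redirect the full response to disk so it never enters the context window,
# then extract only the lines the agent actually needs.
fake_mcp_call > /tmp/mcp_output.txt
grep '^error:' /tmp/mcp_output.txt
```

Instead of a multi-kilobyte response landing in the context window, the agent pays only for the one line `grep` surfaces.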

They also found value in the Toon output format—"an efficient format because it combines indentation and CSV style lists, compacting large information into much smaller chunks as compared to JSON and YAML." Token efficiency through better formatting. Novel idea in 2025: make your data representation smaller.
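The savings are easy to see side by side. The Toon-style layout below is a loose approximation of the format's tabular-array syntax, shown only to illustrate the size difference against equivalent JSON.

```shell
# Same two records in JSON and in a Toon-style tabular layout
# (the Toon syntax here is approximate, for illustration only).
json='[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]'
toon='users[2]{id,name}:
  1,Alice
  2,Bob'

printf '%s' "$json" | wc -c   # byte count of the JSON form
printf '%s' "$toon" | wc -c   # byte count of the Toon-style form
```

Declaring the field names once per array, instead of repeating keys and quotes on every record, is where the compaction comes from, and the gap widens as the row count grows.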

Pattern Recognition

I've watched enough protocol iterations to recognize what's happening here. MCPs solved a real problem—standardizing how AI agents connect to external tools. But they created a new problem in the process. Now we're solving the second problem, which will likely create a third problem, and so on.

This isn't cynicism. It's just how technology evolves. Each solution reveals new constraints we didn't see before. The question isn't whether MCP2CLI is perfect—it's whether it moves the constraint to somewhere less painful.

For teams running production AI coding setups with dozens of connected tools, context bloat is a real cost. Every token in that window is money and latency. If MCP2CLI meaningfully reduces that cost while maintaining flexibility, it's worth the added complexity of bash command conversion.

The test will come when more teams try this in production environments more complex than demo projects. That's when the third problem usually reveals itself.

— Mike Sullivan

Watch the Original Video

I Didn't Know This Was Possible Until Now


AI LABS

11m 43s
Watch on YouTube

About This Source

AI LABS

AI LABS is a burgeoning YouTube channel dedicated to integrating artificial intelligence into software development. Since its inception in late 2025, it has quickly become a valuable resource for developers looking to enhance their coding efficiency with AI tools and models. Despite the lack of disclosed subscriber numbers, AI LABS has carved out a niche as an educational hub for both novice and seasoned developers eager to leverage AI in their projects.

