
Why Your MCP Server Won't Survive Production

Most MCP servers collapse under real workloads. Lenses engineers explain the security cliff between local dev and production—and how to cross it.

Written by AI. Marcus Chen-Ramirez

April 9, 2026


Photo: AI Engineer / YouTube

There's a moment most teams building with the Model Context Protocol hit where everything changes. You've got your MCP server running beautifully on your laptop—local process, single user, no network exposure. It works so well you start imagining it at scale. Then you try to deploy it.

Tun Shwe and Jeremy Frenay from Lenses.io call this the "security cliff." One moment you're in a walled garden with zero security surface. The next, you need OAuth, token management, CORS configuration, TLS, rate limiting—all at once. There's no gradual on-ramp because you can't do "a little bit of production." You're either behind the wall or standing in the open.

The brutal part? Most MCP servers aren't designed to make this jump. Stack Lock's load tests tell the story: 20 out of 22 requests failed with just 20 simultaneous connections over the stdio transport. The architecture that works for one developer on one laptop collapses the moment you add concurrency.

The Security Shadow Problem

Shwe frames the issue through what he calls "security shadows"—the vulnerabilities that emerge from how agents fundamentally differ from human users. Take discovery. When you use a new API, you scan the docs once, find what you need, never look again. An agent can't do that. Every connection means enumerating every tool and reading every description.

"Every one of those tool descriptions is a surface for tool poisoning," Shwe explains. "Attackers can embed hidden instructions inside descriptions that are invisible in the UI, but the model will follow them without question."

Then there's iteration. Your script fails, you run it again—takes a second. When an agent retries, it sends the full conversation history over the wire. Every retry becomes a potential data leak. Any sensitive information from previous tool calls gets broadcast again.

The context problem might be the most insidious. You have decades of experience and intuition. An agent has roughly 200,000 tokens. That's it. And if your MCP server dumps unfiltered data into that limited window—PII, credentials, internal identifiers—it's all one prompt injection away from exfiltration. OWASP lists this as number 10 in their MCP top 10: context injection and oversharing.

Five Principles That Actually Matter

Shwe's core argument: good MCP design and good MCP security are the same discipline. "If you get the design wrong, no amount of OAuth will save you," he says. His five principles give you protection against the OWASP MCP top 10 before you write a single line of authentication code.

Shrink the attack surface by design. Think in outcomes, not operations. Every tool you expose is a door. Don't give the agent access to delete users when it only needs to check an order. Consolidate related operations behind a single tool call with a well-defined outcome. Fewer doors, fewer locks.
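The "outcomes, not operations" idea can be sketched in a few lines. This is an illustrative example, not code from the talk: the order store, `OrderStatus` shape, and `check_order` name are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical order store; stands in for a real backend system.
_ORDERS = {"ord-1": {"status": "shipped", "eta": "2026-04-12"}}

@dataclass
class OrderStatus:
    order_id: str
    status: str
    eta: str

def check_order(order_id: str) -> OrderStatus:
    """One outcome-oriented tool: 'check an order'.

    Internally this might touch several systems, but the agent sees a
    single door with a well-defined result -- no delete_user tool,
    no raw database handle exposed alongside it.
    """
    record = _ORDERS.get(order_id)
    if record is None:
        raise KeyError(f"unknown order: {order_id}")
    return OrderStatus(order_id=order_id, **record)

print(check_order("ord-1").status)  # shipped
```

The point of the consolidation is that the agent never learns about the operations behind the outcome, so there is nothing extra to poison or misuse.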

Constrain inputs at the schema level. Accept top-level primitives like enums. Dictionaries are fine if they're not nested. Use typing libraries like Pydantic for strictness. The aim is rejecting free-form nested payloads to avoid command injection flaws, which almost always trace back to unconstrained string arguments passed to a shell, query engine, or API.
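A minimal sketch of that schema discipline, using only the standard library (the article recommends a typing library like Pydantic; `Region` and the flat-arguments policy here are illustrative assumptions):

```python
from enum import Enum

class Region(str, Enum):
    US = "us"
    EU = "eu"

def validate_args(args: dict) -> dict:
    """Reject anything but flat, typed arguments.

    Assumed policy per the principle above: top-level primitives and
    enums only, no nested structures that could smuggle a free-form
    payload toward a shell, query engine, or API.
    """
    clean = {}
    for key, value in args.items():
        if isinstance(value, (dict, list)):
            raise ValueError(f"nested value not allowed for '{key}'")
        clean[key] = value
    # Coerce through the enum: unknown values raise immediately.
    clean["region"] = Region(clean["region"])
    return clean

validate_args({"region": "eu", "limit": 10})            # accepted
# validate_args({"region": "eu", "q": {"$or": []}})     # raises ValueError
```

With Pydantic the same contract is a model with `Enum` fields and `extra="forbid"`; either way, the injection surface shrinks to values you enumerated in advance.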

Treat documentation as a defensive layer. Tool poisoning works by embedding malicious instructions in tool descriptions that models execute without question. If your documentation is incomplete or ambiguous, an attacker-controlled tool description in a neighboring MCP server can shadow yours. Complete, unambiguous instructions crowd out the space poisoned servers try to fill.

Return only what the agent needs. Strip payloads to the minimum. If the agent doesn't need a piece of data for its immediate task, don't return it. Every piece of information you include is ammunition for prompt injection.
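Response minimization is easiest to enforce as an allowlist. A sketch, with a hypothetical upstream payload and an assumed minimal contract:

```python
# Hypothetical upstream response -- far more than the agent needs.
raw = {
    "order_id": "ord-1",
    "status": "shipped",
    "customer_email": "a@example.com",   # PII the agent has no use for
    "internal_warehouse_id": "WH-7",     # internal identifier
    "payment_token": "tok_live_123",     # credential
}

ALLOWED_FIELDS = {"order_id", "status"}  # assumed minimal contract

def minimize(payload: dict) -> dict:
    """Allowlist, don't blocklist: only explicitly named fields survive."""
    return {k: v for k, v in payload.items() if k in ALLOWED_FIELDS}

print(minimize(raw))  # {'order_id': 'ord-1', 'status': 'shipped'}
```

An allowlist fails safe: when the upstream API adds a new sensitive field, it is dropped by default rather than leaked into the context window.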

Minimize the blast radius. Scope permissions at the tool and resource level, not the session level. Use MCP's read-only annotation for non-destructive tools. Better yet: if a tool is truly read-only, consider making it an MCP resource instead.
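Tool-level scoping can be sketched with the MCP spec's tool annotation hints. The registry and policy function below are illustrative, not SDK code:

```python
# readOnlyHint / destructiveHint follow the MCP tool annotation names;
# the registry itself is a hypothetical sketch.
TOOLS = {
    "get_order_status": {"readOnlyHint": True,  "destructiveHint": False},
    "cancel_order":     {"readOnlyHint": False, "destructiveHint": True},
}

def allowed(tool: str, session_can_write: bool) -> bool:
    """Scope at the tool level, not the session level: read-only tools
    pass everywhere, anything else needs an explicit write grant."""
    hints = TOOLS[tool]
    return bool(hints["readOnlyHint"]) or session_can_write

assert allowed("get_order_status", session_can_write=False)
assert not allowed("cancel_order", session_can_write=False)
```

Note that annotations are hints for clients, not enforcement; the server still has to check permissions on its side, which is what a policy function like this stands in for.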

The mindset shift here is crucial. You're building an interface, not a toolkit. An agent will use anything you provide with confidence. You have to provide the trust layer.

The OAuth Problem Nobody Wants to Solve

Once you've designed well and need to actually deploy, you face what Frenay calls "more than 10 specifications to implement" just for OAuth alone. The core flow, client discovery, metadata, token lifecycle management—it's a lot.

Most teams start with API keys. Local MCP servers run over the stdio transport with an API key stored as an environment variable. It works, sort of. The key is long-lived, rarely rotated, and not scoped to specific actions. Even worse, these keys get shared across systems.

Remote MCP servers running over HTTP have the same problem at scale. The key sits in a config file and often isn't verified by the MCP server itself, just passed through to upstream APIs. This creates what Frenay calls a "confused deputy vulnerability": malicious clients obtain authorization without proper user consent. And if you map that key to another credential for API access, you've created a single shared credential serving many users. If it leaks, everyone's compromised.

This approach still represents more than 50% of MCP servers in production today. But the ecosystem is moving toward short-lived, scoped tokens via OAuth 2.1.

DCR vs CIMD: The Registration Battle

Traditional OAuth assumes you know your clients upfront. Register them in a developer portal, get a client ID, move on. This breaks completely with MCP. Any client—Claude Desktop, Cursor, VS Code, a CLI tool, a random agent—can discover and connect to any MCP server at runtime. It's an unbounded number of clients connecting to an unbounded number of servers.

Dynamic Client Registration (DCR) was the first attempt at solving this. Clients self-register against the authorization server and get a new client ID on every registration. No pre-provisioning required. But DCR has problems. Every connection creates a new registration. Using Claude on Windows then macOS creates two distinct registrations. Worse, DCR is vulnerable to phishing—anyone can POST to the registration endpoint, including attackers. The server just trusts whatever metadata the client self-asserts. A malicious client can claim to be Claude, and there's no way to verify.

Client ID Metadata Document (CIMD), preferred since November 2025, fixes this. Instead of POSTing to a registration endpoint, the client owner exposes metadata at a public URL—their actual domain. When a client tries to authorize, it passes this URL as its unique ID. The authorization server fetches the metadata and registers the client.

The difference is meaningful. Proving you control https://claude.ai actually means something. The redirect URIs are explicitly bound to the client in its metadata document, making it harder for attackers to sneak in malicious callbacks. Authorization servers can selectively allow or deny clients. No growing database of registrations to maintain.
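The CIMD flow above can be sketched as a validation step on the authorization server. The metadata field names follow OAuth client metadata conventions, and the fetch is stubbed out rather than hitting the network; treat the whole thing as illustrative:

```python
def fetch_metadata(client_id_url: str) -> dict:
    """Stub for what would be an HTTPS GET of client_id_url -- the
    client's self-published metadata document on its own domain."""
    return {
        "client_id": "https://claude.ai/.well-known/client-metadata",
        "client_name": "Claude",
        "redirect_uris": ["https://claude.ai/oauth/callback"],
    }

def authorize_client(client_id_url: str, redirect_uri: str) -> bool:
    # The client's ID *is* a URL it controls: proving control of a
    # real domain means something, unlike self-asserted DCR metadata.
    if not client_id_url.startswith("https://"):
        return False
    meta = fetch_metadata(client_id_url)
    # Redirect URIs are bound to the client in its own document, so an
    # attacker can't sneak in an unlisted callback.
    return redirect_uri in meta["redirect_uris"]

assert authorize_client("https://claude.ai/.well-known/client-metadata",
                        "https://claude.ai/oauth/callback")
assert not authorize_client("https://claude.ai/.well-known/client-metadata",
                            "https://evil.example/steal")
```

Because the server fetches the document on demand, there is also no registration database to grow: the client's domain is the registry.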

"CIMD is a leap forward," Frenay says. But even CIMD isn't enterprise-grade on its own.

What Enterprise Actually Requires

OAuth scopes get you partway there, but they're scoped to the session. True enterprise role-based access control means scoping permissions at the individual tool and resource level. You need data masking for PII fields before agents see them. You need comprehensive logging—which agent called which tool, with what parameters, what data was returned. The EU AI Act and similar regulations will expect this level of transparency for autonomous systems.
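Two of those enterprise requirements, masking and audit logging, fit in a short sketch. The PII field list, record shapes, and log format here are all assumptions for illustration:

```python
PII_FIELDS = {"email", "phone"}  # assumed masking policy

def mask(record: dict) -> dict:
    """Mask PII fields before the agent ever sees them."""
    return {k: ("***" if k in PII_FIELDS else v) for k, v in record.items()}

audit_log = []

def call_tool(agent: str, tool: str, params: dict, result: dict) -> dict:
    masked = mask(result)
    # Log which agent called which tool, with what parameters,
    # and which fields came back -- the transparency regulators
    # will expect from autonomous systems.
    audit_log.append({"agent": agent, "tool": tool,
                      "params": params, "returned": sorted(masked)})
    return masked

out = call_tool("agent-7", "lookup_customer", {"id": "c1"},
                {"id": "c1", "email": "a@example.com", "tier": "gold"})
assert out["email"] == "***"
assert audit_log[0]["tool"] == "lookup_customer"
```

Masking at the tool boundary, rather than in the prompt, means a prompt injection can't talk the agent into revealing a value it never received.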

The gap between "it works on my machine" and "it works for the organization" has always existed in software. With MCP, that gap is a chasm. The architecture that makes local development smooth—the thing that makes MCP feel magical for individual productivity—actively works against the constraints production environments require.

Which raises a question the Lenses engineers don't quite answer: is this design/security cliff a solvable engineering problem, or is it inherent to how we're building agentic systems? Are we asking too much of a protocol designed for local convenience, or do we just need better tooling to bridge the gap?

Right now, most teams are learning this the hard way: building for local, then rebuilding for production. That half of production MCP servers still run on long-lived API keys suggests we're still figuring out what production-grade even means.

Marcus Chen-Ramirez is a senior technology correspondent for Buzzrag, covering AI and software development.

Watch the Original Video

Your Insecure MCP Server Won't Survive Production — Tun Shwe, Lenses


AI Engineer

24m 34s
Watch on YouTube

About This Source

AI Engineer

AI Engineer is a prominent YouTube channel dedicated to advancing the knowledge and skills of AI professionals through insightful talks, workshops, and training sessions. Since its inception in December 2025, the channel has garnered over 317,000 subscribers, becoming an essential hub for those engaged in the field of artificial intelligence.

