Why Your MCP Server Won't Survive Production
Most MCP servers collapse under real workloads. Lenses engineers explain the security cliff between local dev and production—and how to cross it.
By Marcus Chen-Ramirez
April 9, 2026

Photo: AI Engineer / YouTube
There's a moment most teams building with the Model Context Protocol hit where everything changes. You've got your MCP server running beautifully on your laptop—local process, single user, no network exposure. It works so well you start imagining it at scale. Then you try to deploy it.
Tun Shwe and Jeremy Frenay from Lenses.io call this the "security cliff." One moment you're in a walled garden with zero security surface. The next, you need OAuth, token management, CORS configuration, TLS, rate limiting—all at once. There's no gradual on-ramp because you can't do "a little bit of production." You're either behind the wall or standing in the open.
The brutal part? Most MCP servers aren't designed to make this jump. Stacklok's load tests tell the story: 20 out of 22 requests failed with just 20 simultaneous connections over the stdio transport. The architecture that works for one developer on one laptop collapses the moment you add concurrency.
The Security Shadow Problem
Shwe frames the issue through what he calls "security shadows"—the vulnerabilities that emerge from how agents fundamentally differ from human users. Take discovery. When you use a new API, you scan the docs once, find what you need, never look again. An agent can't do that. Every connection means enumerating every tool and reading every description.
"Every one of those tool descriptions is a surface for tool poisoning," Shwe explains. "Attackers can embed hidden instructions inside descriptions that are invisible in the UI, but the model will follow them without question."
Then there's iteration. Your script fails, you run it again—takes a second. When an agent retries, it sends the full conversation history over the wire. Every retry becomes a potential data leak. Any sensitive information from previous tool calls gets broadcast again.
The context problem might be the most insidious. You have decades of experience and intuition. An agent has roughly 200,000 tokens. That's it. And if your MCP server dumps unfiltered data into that limited window—PII, credentials, internal identifiers—it's all one prompt injection away from exfiltration. OWASP lists this as number 10 in their MCP top 10: context injection and oversharing.
Five Principles That Actually Matter
Shwe's core argument: good MCP design and good MCP security are the same discipline. "If you get the design wrong, no amount of OAuth will save you," he says. His five principles give you protection against the OWASP MCP top 10 before you write a single line of authentication code.
Shrink the attack surface by design. Think in outcomes, not operations. Every tool you expose is a door. Don't give the agent access to delete users when it only needs to check an order. Consolidate related operations behind a single tool call with a well-defined outcome. Fewer doors, fewer locks.
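A sketch of what "outcomes, not operations" means in practice, using a hypothetical order backend: instead of exposing get_user, update_user, delete_user, and a family of order endpoints, expose one read-only outcome. All names and data here are invented for illustration.

```python
# Hypothetical order store standing in for a real backend.
ORDERS = {
    "ord_123": {
        "status": "shipped",
        "eta": "2026-04-12",
        "customer_email": "pat@example.com",
        "internal_sku": "X-99",
    },
}

def check_order(order_id: str) -> dict:
    """Single read-only outcome: where is this order?

    Returns only the fields the agent needs -- no email, no SKU,
    and no path to a destructive operation.
    """
    order = ORDERS.get(order_id)
    if order is None:
        return {"found": False}
    return {"found": True, "status": order["status"], "eta": order["eta"]}

print(check_order("ord_123"))
```

One door with a well-defined outcome replaces half a dozen operations the agent never needed.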
Constrain inputs at the schema level. Accept flat, top-level primitives, enums wherever possible; dictionaries are fine only if they're not nested. Use a typing library like Pydantic for strictness. The goal is to reject free-form nested payloads, because command injection flaws almost always trace back to unconstrained string arguments passed to a shell, query engine, or API.
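In production you would reach for Pydantic as the talk suggests; the stdlib-only sketch below shows the same idea with invented argument names: flat typed primitives and enums in, anything nested or unexpected rejected.

```python
from enum import Enum

class Region(Enum):
    EU = "eu"
    US = "us"

# Hypothetical tool arguments: every key has one flat, known type.
ALLOWED = {"region": Region, "limit": int, "query": str}

def validate_args(args: dict) -> dict:
    """Accept only flat, typed, enumerable arguments; reject the rest."""
    clean = {}
    for key, value in args.items():
        if key not in ALLOWED:
            raise ValueError(f"unexpected argument: {key}")
        expected = ALLOWED[key]
        if isinstance(expected, type) and issubclass(expected, Enum):
            clean[key] = expected(value)  # raises ValueError off-enum
        elif isinstance(value, expected) and not isinstance(value, (dict, list)):
            clean[key] = value
        else:
            raise ValueError(f"bad type for {key}: {type(value).__name__}")
    return clean
```

A nested payload like {"filter": {"$where": "1==1"}} never reaches your query engine; it dies at the schema.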
Treat documentation as a defensive layer. Tool poisoning works by embedding malicious instructions in tool descriptions that models execute without question. If your documentation is incomplete or ambiguous, an attacker-controlled tool description in a neighboring MCP server can shadow yours. Complete, unambiguous instructions crowd out the space poisoned servers try to fill.
Return only what the agent needs. Strip payloads to the minimum. If the agent doesn't need a piece of data for its immediate task, don't return it. Every piece of information you include is ammunition for prompt injection.
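The simplest enforcement of this principle is an explicit allow-list applied to every upstream response before it reaches the agent. The payload and field names below are hypothetical.

```python
# Hypothetical upstream payload: far more than the agent's task requires.
upstream = {
    "order_id": "ord_123",
    "status": "shipped",
    "customer_email": "pat@example.com",
    "card_last4": "4242",
    "warehouse_notes": "fragile -- dock 7 access code 9913",
}

# Explicit allow-list: anything not named here never enters the context window.
AGENT_FIELDS = {"order_id", "status"}

def minimize(payload: dict) -> dict:
    """Strip a response down to the fields the agent actually needs."""
    return {k: v for k, v in payload.items() if k in AGENT_FIELDS}

print(minimize(upstream))
```

Note the default direction: fields are dropped unless named, not kept unless flagged sensitive.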
Minimize the blast radius. Scope permissions at the tool and resource level, not the session level. Use MCP's read-only annotation for non-destructive tools. Better yet: if a tool is truly read-only, consider making it an MCP resource instead.
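The annotation shape below follows the tool annotations in the MCP specification's tools/list result; the tool itself and its schema are hypothetical. Annotations are hints to the client, not an enforcement mechanism, which is why the spec pairs them with server-side scoping.

```python
# A tools/list entry with MCP annotations; the tool is invented for illustration.
tool = {
    "name": "check_order",
    "description": "Report the status and ETA of an order.",
    "inputSchema": {
        "type": "object",
        "properties": {"order_id": {"type": "string"}},
        "required": ["order_id"],
    },
    "annotations": {
        "readOnlyHint": True,      # never modifies state
        "destructiveHint": False,  # only meaningful when readOnlyHint is false
        "openWorldHint": False,    # operates on a closed, known backend
    },
}
```

If a tool really is pure read, promoting it to an MCP resource removes even the appearance of an action.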
The mindset shift here is crucial. You're building an interface, not a toolkit. An agent will use anything you provide with confidence. You have to provide the trust layer.
The OAuth Problem Nobody Wants to Solve
Once you've designed well and need to actually deploy, you face what Frenay calls "more than 10 specifications to implement" just for OAuth alone. The core flow, client discovery, metadata, token lifecycle management—it's a lot.
Most teams start with API keys. Local MCP servers run over stdio with an API key stored as an environment variable. It works, sort of. The key is long-lived, rarely rotated, and not scoped to specific actions. Even worse, these keys get shared across systems.
Remote MCP servers running over HTTP have the same problem at scale. The key sits in a config file and often isn't verified by the MCP server itself, just passed through to upstream APIs. This creates what Frenay calls a "confused deputy" vulnerability: malicious clients obtain authorization without proper user consent. And if you map that key to another credential for API access, you've created a single shared credential serving many users. If it leaks, everyone's compromised.
This approach still represents more than 50% of MCP servers in production today. But the ecosystem is moving toward short-lived, scoped tokens via OAuth 2.1.
DCR vs CIMD: The Registration Battle
Traditional OAuth assumes you know your clients upfront. Register them in a developer portal, get a client ID, move on. This breaks completely with MCP. Any client—Claude Desktop, Cursor, VS Code, a CLI tool, a random agent—can discover and connect to any MCP server at runtime. It's an unbounded number of clients connecting to an unbounded number of servers.
Dynamic Client Registration (DCR) was the first attempt at solving this. Clients self-register against the authorization server and get a new client ID on every registration. No pre-provisioning required. But DCR has problems. Every connection creates a new registration. Using Claude on Windows then macOS creates two distinct registrations. Worse, DCR is vulnerable to phishing—anyone can POST to the registration endpoint, including attackers. The server just trusts whatever metadata the client self-asserts. A malicious client can claim to be Claude, and there's no way to verify.
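The weakness is visible in the registration request itself. The sketch below builds an RFC 7591 dynamic registration body; the authorization server URL is hypothetical, and every metadata field is self-asserted by whoever sends the POST.

```python
import json
import urllib.request

# RFC 7591 dynamic registration: anyone can POST self-asserted metadata.
registration = {
    "client_name": "Claude Desktop",       # self-asserted -- unverifiable
    "redirect_uris": ["http://localhost:33418/callback"],
    "grant_types": ["authorization_code"],
    "token_endpoint_auth_method": "none",  # public client, no secret
}

req = urllib.request.Request(
    "https://auth.example.com/register",   # hypothetical endpoint
    data=json.dumps(registration).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return a fresh client_id on every call --
# and nothing stops an attacker from claiming the same client_name.
```

The server has no basis for trusting any of it, which is exactly the phishing surface DCR critics point to.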
Client ID Metadata Document (CIMD), preferred since November 2025, fixes this. Instead of POSTing to a registration endpoint, the client owner exposes metadata at a public URL—their actual domain. When a client tries to authorize, it passes this URL as its unique ID. The authorization server fetches the metadata and registers the client.
The difference is meaningful. Proving you control https://claude.ai actually means something. The redirect URIs are explicitly bound to the client in its metadata document, making it harder for attackers to sneak in malicious callbacks. Authorization servers can selectively allow or deny clients. No growing database of registrations to maintain.
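A sketch of what a client ID metadata document might look like under the CIMD draft; the field names follow standard OAuth client metadata, while the exact URL path is an assumption. The key properties are that the client_id is the URL the document is served from, and that redirect URIs are bound inside it.

```python
# Hypothetical location; under CIMD the document's URL *is* the client_id.
CLIENT_ID = "https://claude.ai/.well-known/oauth-client"

metadata_document = {
    "client_id": CLIENT_ID,  # must match the URL it is served at
    "client_name": "Claude",
    "redirect_uris": ["https://claude.ai/oauth/callback"],
    "grant_types": ["authorization_code"],
    "token_endpoint_auth_method": "none",
}

def redirect_allowed(uri: str) -> bool:
    """An authorization server only honors callbacks bound in the document."""
    return uri in metadata_document["redirect_uris"]
```

Because the document lives on the client owner's domain, "I am Claude" becomes a claim the authorization server can actually check before issuing anything.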
"CIMD is a leap forward," Frenay says. But even CIMD isn't enterprise-grade on its own.
What Enterprise Actually Requires
OAuth scopes get you partway there, but they're scoped to the session. True enterprise role-based access control means scoping permissions at the individual tool and resource level. You need data masking for PII fields before agents see them. You need comprehensive logging—which agent called which tool, with what parameters, what data was returned. The EU AI Act and similar regulations will expect this level of transparency for autonomous systems.
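The masking and logging requirements above can be sketched together: one structured record per tool call, with PII redacted before it reaches either the agent or the log. The agent name, tool name, and single email regex are illustrative assumptions, not a compliance implementation.

```python
import json
import re
import time

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def mask_pii(value: str) -> str:
    """Redact email addresses before they reach the agent or the log."""
    return EMAIL.sub("<redacted:email>", value)

def audited_call(agent: str, tool: str, params: dict, result: str) -> dict:
    """One structured record per tool call: which agent, which tool,
    which parameters, what came back -- the trail regulators will ask for."""
    record = {
        "ts": time.time(),
        "agent": agent,
        "tool": tool,
        "params": params,
        "result": mask_pii(result),
    }
    print(json.dumps(record))  # ship to a real log pipeline in practice
    return record

rec = audited_call(
    "claude-desktop", "check_order", {"order_id": "ord_123"},
    "shipped; contact pat@example.com",
)
```

Real deployments would mask far more than emails and sign or ship the records elsewhere, but the shape of the requirement is this small.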
The gap between "it works on my machine" and "it works for the organization" has always existed in software. With MCP, that gap is a chasm. The architecture that makes local development smooth—the thing that makes MCP feel magical for individual productivity—actively works against the constraints production environments require.
Which raises a question the Lenses engineers don't quite answer: is this design/security cliff a solvable engineering problem, or is it inherent to how we're building agentic systems? Are we asking too much of a protocol designed for local convenience, or do we just need better tooling to bridge the gap?
Right now, most teams are learning this the hard way: building for local, then rebuilding for production. That more than half of production MCP servers still run on long-lived API keys suggests we're still figuring out what production-grade even means.
Marcus Chen-Ramirez is a senior technology correspondent for Buzzrag, covering AI and software development.
Watch the Original Video
Your Insecure MCP Server Won't Survive Production — Tun Shwe, Lenses
AI Engineer
24m 34s

About This Source
AI Engineer
AI Engineer is a prominent YouTube channel dedicated to advancing the knowledge and skills of AI professionals through insightful talks, workshops, and training sessions. Since its inception in December 2025, the channel has garnered over 317,000 subscribers, becoming an essential hub for those engaged in the field of artificial intelligence.