
That Agent.md File Might Be Making Your AI Worse

New research shows those popular Agent.md and Claude.md files could actually hurt AI coding performance. Here's what developers need to know about context.

Written by Zara Chen, an AI editorial voice

February 23, 2026


Photo: Theo - t3.gg / YouTube

You know that Agent.md file sitting in your repo? The one everyone said you absolutely needed to make AI coding work properly? Yeah, turns out it might be actively making things worse.

A new study just dropped that benchmarked these context files across multiple AI models—Claude 3.5 Sonnet, GPT-4, and others—and the results are... not great. When AI coding agents were given developer-written Agent.md or Claude.md files, the files barely moved the needle, and AI-generated versions consistently made performance measurably worse.

This is wild because the entire AI dev community has been evangelizing these files for months. Everyone's publishing their custom rule sets, their skills files, their perfectly crafted context documents. Tool makers actively encourage it. And now we have data suggesting it's counterproductive.

What the Research Actually Found

The study tested three conditions: repos with developer-written context files, repos with those files removed, and repos where the AI generated its own context file before starting work.

The numbers are striking. Developer-provided files only improved performance by 4% on average compared to having nothing at all. AI-generated context files? They decreased performance by 3%. And here's the kicker: these context files increased costs by over 20% because they made the models do more exploration, testing, and reasoning—most of it unnecessary.

As Theo from t3.gg, who covered the study in depth, puts it: "We observed that the developer provided files only marginally improved performance compared to omitting them entirely, an increase of 4% on average, while the LLM generated context files had a small negative effect, a decrease of 3% on average."

The Pink Elephant Problem

The issue is context pollution. When you tell an AI model something, it thinks about that thing. Full stop. It's like the "don't think about pink elephants" problem—the moment you mention it, everyone's thinking about pink elephants.

Here's how it works: every time you send a message to an AI coding assistant, you're not just sending that message. You're sending the entire conversation history, plus the system prompt the company wrote, plus any developer messages like your Agent.md file. All of that gets processed on every single token generation.
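To make that concrete, here's a rough sketch of what an assistant assembles on every turn. This is a simplified, hypothetical payload—the field names mimic common chat-API conventions, not any specific vendor's API:

```python
# Sketch of what a coding assistant sends on EVERY model call.
# Field names are illustrative, not any specific vendor's API.

def build_request(system_prompt, agent_md, history, new_message):
    """Assemble the full context for one model call.

    Note that agent_md rides along on every single request, so
    every token in it gets processed on every turn of the chat.
    """
    messages = [{"role": "system", "content": system_prompt}]
    if agent_md:
        # Context files are typically injected near the top,
        # ahead of the conversation itself.
        messages.append({"role": "system", "content": agent_md})
    messages.extend(history)
    messages.append({"role": "user", "content": new_message})
    return {"model": "some-coding-model", "messages": messages}

req = build_request(
    system_prompt="You are a coding assistant.",
    agent_md="# Agent.md\nWe use tRPC on the backend.",
    history=[
        {"role": "user", "content": "fix the login page"},
        {"role": "assistant", "content": "Done."},
    ],
    new_message="now fix the signup page",
)
print(len(req["messages"]))  # 5 -- the Agent.md is in there again
```

The point of the sketch: the context file isn't a one-time briefing, it's a standing line item in every request you pay for.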

If your Agent.md mentions you use tRPC somewhere in your stack—even if it's just legacy code you barely touch—the model is now biased toward reaching for tRPC. It's in the context. It's salient. The model will autocomplete toward it even when it doesn't make sense.

Theo discovered this in his own codebase: "Mentioning that we use tRPC on the back end is now going to bias it towards using tRPC even though we only use it for a handful of legacy functions. Almost everything is now on Convex. Not only does it know we have tRPC, we actually put it in front of the Convex part. So it is much more likely to reach for tRPC where it might not make sense."

What Actually Belongs in Agent.md?

The consensus emerging from both research and practice: if the model can find it easily, it doesn't belong in your context file.

Modern AI models are actually really good at navigating codebases. They've been trained (via reinforcement learning) to use grep, to search for strings, to trace dependencies. If you paste a screenshot of broken UI and say "fix this," the model will search for unique strings from that UI, ripgrep through your codebase until it finds the right component, verify nothing else depends on it, make the change, and tell you it's done.

No Agent.md required.
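For a feel of how simple that search step really is, here's a minimal, purely illustrative sketch of the grep-style lookup an agent runs with its tooling (a poor man's ripgrep in Python; the file extensions and skip list are assumptions):

```python
import os

def find_string(root, needle, exts=(".ts", ".tsx", ".py")):
    """Walk a repo and return paths of source files containing needle.

    This mirrors what coding agents do on their own: search for a
    distinctive string from the UI, then open the matching component.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip directories a real search tool would also ignore.
        dirnames[:] = [d for d in dirnames if d not in {".git", "node_modules"}]
        for name in filenames:
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    if needle in f.read():
                        hits.append(path)
            except OSError:
                pass  # unreadable file; a real tool would skip it too
    return hits
```

Nothing in that loop needs an architecture overview to work—just a unique string and a file tree.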

So what should go in these files? The study's recommendation is minimal: "only minimal requirements like specific tooling to use with the repository." Things the model genuinely can't infer from the codebase itself.

That might mean:

  • Non-obvious build commands or tooling requirements
  • Critical constraints the model keeps violating (and only after you've verified it's actually a pattern, not a one-off)
  • Workflow-specific rules that aren't encoded anywhere else
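Put together, a restrained file might look something like this (a hypothetical sketch—the tools and commands shown are placeholders, not recommendations from the study):

```markdown
# Agent.md

## Tooling
- Use `pnpm` (not npm) for installs and scripts.
- Run `pnpm test:unit` before committing; CI rejects untested changes.

## Constraints
- Never edit files under `generated/` — they are overwritten by codegen.
```

Everything else—stack choices, architecture, coding patterns—the model can read straight out of the repo.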

But here's the thing: the best time to update your Agent.md isn't at the start of a project. It's after you've noticed the model consistently making the same mistake despite having all the information it needs in the codebase.

The Meeting Room Analogy

Theo makes a comparison that really lands: "You know how much we all hate having endless meetings as developers? We don't need to know all of the intricate details of the five versions that the product and design went back and forth on before we have to go implement it. We're in the meeting anyways. Why the fuck do we think the AI likes it more than us?"

We're essentially giving AI models the equivalent of pointless context dumps before every task. Architecture overviews they can derive from imports. Commands they can find in package.json. Patterns they can infer from existing code.

All of that costs tokens. All of it gets processed. And most importantly: all of it distracts from what you actually want the model to do.

What This Means for Your Workflow

If you've got Agent.md files in your repos right now, you probably don't need to delete them immediately. But you should audit them. Ask: would the model find this information anyway? Is this actually helping, or just adding noise?

The study's conclusion is pretty clear: skip LLM-generated context files entirely. For manually written ones, use extreme restraint.

And maybe—this is the uncomfortable part—consider that we've been cargo-culting a practice that looked like best practice but was actually just... practice. Everyone was doing it, so it seemed smart. Now we have data suggesting otherwise.

The models are better at finding information than we gave them credit for. The context files we thought were helping were mostly just getting in the way. Sometimes the best prompt engineering is knowing when to say less.

—Zara Chen, Tech & Politics Correspondent

Watch the Original Video

Delete your CLAUDE.md (and your AGENT.md too)
Theo - t3.gg · 29m 16s

About This Source

Theo - t3.gg

Theo - t3.gg is a burgeoning YouTube channel that has quickly amassed a following of 492,000 subscribers since launching in October 2025. Headed by Theo, a passionate software developer and AI enthusiast, the channel explores the realms of artificial intelligence, TypeScript, and innovative software development methodologies. Notable for initiatives like T3 Chat and the T3 Stack, Theo has carved out a niche as a knowledgeable and engaging figure in the tech community.

