AI Coding Tools Work Best With Old Engineering Practices
Developer educator Matt Pocock argues AI coding assistants amplify code quality issues. His solution? Decades-old software fundamentals matter more than ever.
Written by AI · Dev Kapoor
April 24, 2026

Photo: AI Engineer / YouTube
There's a movement happening in AI coding circles called "specs to code." The pitch is elegant: write a specification, feed it to an AI, get code out. If something breaks, don't touch the code—just modify the spec and run it again. It's compiler thinking applied to AI assistants, and according to developer educator Matt Pocock, it produces garbage.
Pocock has spent the last 18 months teaching developers to build with AI coding tools like Claude Code. He's watched hundreds of engineers hit the same wall: the first AI-generated code works okay, the second iteration works worse, and by the third pass, you're staring at an unmaintainable mess. The specs-to-code approach, he argues in a recent talk at the AI Engineer conference, is "just vibe coding by another name"—the fantasy that you can delegate everything and never think about the actual structure of your system.
The problem isn't the AI. It's what Pocock calls "software entropy"—the tendency of codebases to decay with every change unless someone actively invests in the design. When you treat code as disposable output from a spec compiler, nobody's investing. The AI doesn't understand system-level design. It's making tactical changes without strategic context. And entropy accelerates.
The Fundamental Misconception
Underlying the specs-to-code movement is an assumption: code is cheap. If the AI can generate unlimited code, who cares if it's messy? Just regenerate it.
Pocock's counterargument is stark: "Bad code is the most expensive it's ever been." Not because AI makes bad code worse—though it can—but because a hard-to-change codebase blocks you from extracting AI's actual value. AI coding assistants excel in well-structured codebases with clear boundaries and good tests. They flounder in spaghetti. The worse your architecture, the less AI can help you.
This inverts the conventional AI narrative. Instead of making engineering practices obsolete, AI makes them load-bearing. You need more engineering fundamentals to use AI well, not fewer.
Failure Mode One: The AI Didn't Understand
The first breakdown Pocock sees is miscommunication. You think you explained what you want. The AI thinks it understands. You both discover you were wrong after the code ships.
He borrows a concept from Frederick P. Brooks: the "design concept," an invisible shared understanding that exists between collaborators. You can't put it in a markdown file. It's the theory of what you're building, and it has to be genuinely mutual.
Pocock's solution is a prompt he calls "Grill Me": Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one by one.
The prompt went viral—13,000 GitHub stars. Developers report the AI asking 40, 60, sometimes 100 questions before it's satisfied. It transforms the interaction from eager code generation into adversarial requirements gathering. The result isn't just better alignment—it's a conversation artifact you can convert into a product requirements document or task list.
Pocock thinks this beats the default "plan mode" in most AI coding tools, which are, in his words, "extremely eager to create an asset" and start coding before genuine understanding exists.
Failure Mode Two: The AI Speaks a Different Language
The second breakdown is language mismatch. The AI uses terms you don't recognize. You use terms it interprets differently. Everyone's talking past each other, and the code reflects it.
This is a classic problem in software development—the gap between domain experts and engineers. Domain-driven design solved it decades ago with "ubiquitous language": a shared vocabulary derived from the problem domain, used consistently in code, documentation, and conversation.
Pocock built a skill that scans codebases, extracts terminology, and generates a ubiquitous language document—markdown tables of terms the developer and AI both reference. He keeps it open during planning sessions. By reading the AI's thinking traces (when available), he's noticed it not only improves planning but makes the AI think less verbosely. Implementation aligns better with intent.
It's Domain-Driven Design, but the "domain expert" is a large language model.
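Pocock's actual skill isn't public in the talk, but the core idea can be sketched in a few lines: scan source text for declared identifiers, split them into candidate domain terms, and emit a markdown table for the human to fill in. Everything here—`extractTerms`, `toMarkdownTable`, the regexes—is illustrative, not his implementation.

```typescript
// Hypothetical sketch of a ubiquitous-language extractor. It pulls
// declared identifiers out of source text, splits CamelCase names into
// candidate domain terms, and emits a markdown table skeleton.
function extractTerms(source: string): string[] {
  // Match declarations like "class InvoiceLedger" or "function reconcilePayment"
  const ids = source.match(/(?:function|class|interface|type)\s+([A-Za-z0-9_]+)/g) ?? [];
  const terms = new Set<string>();
  for (const id of ids) {
    const name = id.split(/\s+/)[1];
    // Split CamelCase into individual words: "InvoiceLedger" -> invoice, ledger
    for (const word of name.split(/(?=[A-Z])/)) terms.add(word.toLowerCase());
  }
  return [...terms].sort();
}

function toMarkdownTable(terms: string[]): string {
  // Definitions are left blank: the developer (or the AI, in review)
  // fills them in, and the table becomes the shared vocabulary.
  const rows = terms.map((t) => `| ${t} | |`);
  return ["| Term | Definition |", "| --- | --- |", ...rows].join("\n");
}

const src = `
class InvoiceLedger {}
function reconcilePayment() {}
interface TaxRule {}
`;
console.log(toMarkdownTable(extractTerms(src)));
```

A real version would walk the file tree and use a proper parser rather than regexes, but the output shape—a table of terms both parties reference during planning—is the point.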
Failure Mode Three: The AI Built It, But It Doesn't Work
The third breakdown is correctness. Even when aligned and communicating clearly, the AI ships broken code.
The obvious fix is feedback loops: static types, automated tests, browser access for frontend work. But Pocock noticed something: LLMs don't use feedback loops efficiently. They're bad at incremental development. They generate huge chunks of code, then type-check afterward, like a student who writes the whole essay before proofreading.
The Pragmatic Programmer calls this "outrunning your headlights"—driving faster than your feedback mechanism allows. Pocock's solution is test-driven development. Write the test first, make it pass, refactor. TDD forces the AI into small, deliberate steps.
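The loop itself is small enough to sketch. Here's a minimal red-green cycle in plain TypeScript—the test is written first and states the behavior, and the implementation below it is the smallest change that makes it pass. The `slugify` example is illustrative, not from the talk.

```typescript
// A minimal TDD cycle: test first, then the smallest passing implementation.

// Step 1 (red): the test exists before the implementation does.
function testSlugify(): void {
  if (slugify("Hello, World!") !== "hello-world") throw new Error("red: slugify failed");
}

// Step 2 (green): the smallest implementation that satisfies the test.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics into one dash
    .replace(/^-|-$/g, "");      // trim leading/trailing dashes
}

testSlugify(); // passes; now refactor, or write the next failing test
console.log("slugify tests pass");
```

The value for AI pairing is the step size: each cycle is one small, verifiable claim, so the assistant can't outrun its headlights by generating a thousand untested lines.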
But TDD only works if your codebase is testable, which brings us to the structural question: what does a testable codebase look like?
Deep Modules vs. Shallow Modules
John Ousterhout's A Philosophy of Software Design distinguishes between deep and shallow modules. Deep modules hide complexity behind simple interfaces—lots of functionality, minimal surface area. Shallow modules expose complex interfaces for minimal functionality.
AI-generated codebases, left unchecked, tend toward shallow modules. Tons of tiny files with complex interdependencies. The AI struggles to navigate these. It can't keep the dependency graph in context. It doesn't understand what the code does because it's scattered across fifty shallow modules.
Deep modules change this. They create clear boundaries with simple interfaces. You test at the interface. The implementation can stay messy—or you can let the AI handle it—because the boundary is solid.
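A concrete sketch of the idea, assuming a made-up example (Ousterhout's book uses different ones): a rate limiter whose entire public surface is one method. The sliding-window bookkeeping inside is invisible, and the test exercises only the boundary—including injecting a fake clock, so the implementation can be rewritten freely.

```typescript
// A deep module: one small interface (`allow`) hiding a sliding-window
// rate limiter's bookkeeping. Callers and tests touch only the boundary.
class RateLimiter {
  private hits = new Map<string, number[]>();
  constructor(
    private limit: number,
    private windowMs: number,
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  // The entire public surface: may this key make a request right now?
  allow(key: string): boolean {
    const t = this.now();
    // Drop hits that have aged out of the window.
    const recent = (this.hits.get(key) ?? []).filter((h) => t - h < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(t);
    this.hits.set(key, recent);
    return true;
  }
}

// Testing at the interface: fake clock in, behavior out.
let fakeTime = 0;
const limiter = new RateLimiter(2, 1000, () => fakeTime);
console.log(limiter.allow("alice")); // true
console.log(limiter.allow("alice")); // true
console.log(limiter.allow("alice")); // false (third hit inside the window)
fakeTime = 2000;
console.log(limiter.allow("alice")); // true (window has passed)
```

Swap the array-of-timestamps internals for a token bucket and every caller and test still works—that's the deep-module payoff.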
Pocock has a skill for this too: "Improve codebase architecture." It explores the codebase, identifies related code, and wraps it in deep modules. The result is a codebase that rewards TDD and lets the AI understand system structure.
The Strategic/Tactical Division
The pattern across all these failure modes is the same: AI is a tactical programmer. It's the sergeant on the ground making changes. But somebody needs to think strategically—about design, boundaries, interface contracts, module architecture.
That's the human role. Not writing every line of code. Designing the interfaces. Investing in system design daily, as Kent Beck advises. Using AI as an implementation engine within a structure you control.
Pocock frames it as "design the interface, delegate the implementation." You don't review every line inside a well-tested module. You treat it as a gray box—verify the boundary works, move on. This scales your brain because you're not trying to hold the entire implementation in your head.
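One way to make that concrete—a sketch, not Pocock's workflow—is to write the contract and a boundary test yourself, then treat any implementation that passes it as a gray box. All names here are illustrative.

```typescript
// "Design the interface, delegate the implementation": the human owns
// the contract and the boundary test; the implementation is a gray box.
interface KeyValueStore {
  set(key: string, value: string): void;
  get(key: string): string | undefined;
  delete(key: string): boolean;
}

// Boundary test: runs against *any* implementation of the contract.
function verifyStore(store: KeyValueStore): void {
  store.set("a", "1");
  if (store.get("a") !== "1") throw new Error("get after set failed");
  if (!store.delete("a")) throw new Error("delete of existing key failed");
  if (store.get("a") !== undefined) throw new Error("get after delete failed");
}

// Stand-in for delegated (e.g. AI-written) code. Its internals are free
// to change as long as verifyStore keeps passing at the boundary.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  set(key: string, value: string): void { this.data.set(key, value); }
  get(key: string): string | undefined { return this.data.get(key); }
  delete(key: string): boolean { return this.data.delete(key); }
}

verifyStore(new MemoryStore());
console.log("boundary holds");
```

You review `verifyStore` carefully and `MemoryStore` barely at all—which is exactly the head-scaling trade the gray-box framing describes.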
But it requires knowing your module map cold. It requires thinking about interfaces during planning. It requires, in other words, the fundamentals: clean architecture, ubiquitous language, test-driven development, deliberate design.
Why This Matters For OSS
From a community perspective, this has implications beyond individual productivity. If AI coding tools amplify existing code quality—making good codebases better and bad codebases worse—then OSS projects with poor architecture are about to get much harder to maintain.
We're already seeing maintainer burnout accelerate. Adding AI to poorly structured projects won't fix that. It might make it worse, as contributors use AI to ship faster without investing in design, accelerating entropy.
The projects that will thrive are the ones that enforce architectural standards, maintain clear module boundaries, require tests, and document their ubiquitous language. The fundamentals aren't just individual best practices anymore—they're community survival strategies.
Pocock's message is reassuring and demanding in equal measure: your existing skills aren't obsolete. They're more important. But you have to actually use them. The AI won't do it for you. It can't. That's the job.
—Dev Kapoor
Watch the Original Video
It Ain't Broke: Why Software Fundamentals Matter More Than Ever — Matt Pocock, AI Hero @mattpocockuk
AI Engineer
18m 26s
About This Source
AI Engineer
AI Engineer is a rapidly growing YouTube channel dedicated to the professional development of AI engineers. Since its inception in December 2025, the channel has amassed over 317,000 subscribers by offering a diverse array of content including talks, workshops, events, and training sessions designed to expand the skill sets of AI enthusiasts and professionals alike.