Visual Plans for Claude Code Change Agent Reviews

You know the moment. Claude Code finishes its planning pass, dumps a wall of markdown into your terminal, and you scroll through it nodding along like you understood every word — and then two hours later the agent hands you something that is technically what you asked for and completely not what you meant. The text sounded fine. The implementation was not fine. That gap between "this reads correctly" and "this is what I wanted" is the gap that eats afternoons.

Steve Sewell, founder of Builder.io, has been living in that gap and apparently decided to build his way out of it.

His new open-source tool introduces two skills for Claude Code: /visual-plan and /visual-recap. The pitch is straightforward — instead of generating plans as markdown prose that you skim and half-absorb, generate them as MDX (think: Markdown that can embed React components) with interactive diagrams, annotated code, API specs you can click through, and pan-and-zoomable wireframes. Before the agent writes a single line of code, you're looking at an actual wireframe of the UI it's about to build. You can react to that wireframe. You can say "no, the nav goes here, not there" before it's nav-goes-there code that has to be unpicked.

"I found this to be a much more intuitive interface for me to reason about what the agent's doing," Sewell says in the video. "It's really made me feel like humans and engineering is kind of entering this new abstraction phase where we reason about things at the plan level."

The visual-recap skill runs the same logic in reverse: once the agent finishes, instead of reading a terse summary or diving into raw diffs line by line, you get a visual recap — wireframes of what was built, interactive API specs, schema changes, annotated code. The same idea, different moment in the workflow. Catch misalignment before the work starts; catch missed requirements before it ships.

Why MDX specifically, and why does it matter

MDX is doing real work here, not just aesthetic work. Sewell tried HTML first — there's a post circulating about HTML being superior to markdown for AI-generated output — but HTML has problems as a repo artifact. It's verbose, it looks like garbage when you check it in, and crucially, it's inconsistent: every generation produces slightly different HTML soup. Switch models or agents and you get different soup.

MDX threads the needle. It's readable as a raw file (closer to markdown than HTML), it supports reusable React components (so your <APISpec> component looks the same every time, regardless of which model generated the MDX), and it's plain text that sits comfortably in version control. The consistency argument is sharper than it sounds: if your visual output format is stable across model changes, you can build tooling on top of it, which is exactly what Sewell did.
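To make the consistency point concrete, here is a minimal sketch in TypeScript. It is not Sewell's code: the prop names, the `renderApiSpec` function, and the HTML it emits are all hypothetical stand-ins for whatever an <APISpec>-style component actually does. The only idea it illustrates is that the model supplies data while the component owns the presentation, so two generations with the same data render identically.

```typescript
// Hypothetical prop shape for an <APISpec>-style component. These field
// names are assumptions for illustration, not the real component's API.
interface ApiSpecProps {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  summary: string;
  params?: { name: string; type: string; required?: boolean }[];
}

// Escape model-supplied text so it cannot inject markup into the output.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Stand-in for the component's render step: a pure function from props to
// markup. The layout lives here, not in whatever the model generated.
function renderApiSpec(props: ApiSpecProps): string {
  const rows = (props.params ?? [])
    .map(
      (p) =>
        `<li><code>${escapeHtml(p.name)}</code>: ${escapeHtml(p.type)}` +
        `${p.required ? " (required)" : ""}</li>`
    )
    .join("");
  return (
    `<section class="api-spec">` +
    `<h3><span class="method">${props.method}</span> ${escapeHtml(props.path)}</h3>` +
    `<p>${escapeHtml(props.summary)}</p>` +
    (rows ? `<ul>${rows}</ul>` : "") +
    `</section>`
  );
}

// Two different generations that happen to describe the same endpoint.
const fromModelA: ApiSpecProps = {
  method: "POST",
  path: "/api/todos",
  summary: "Create a todo item",
  params: [{ name: "title", type: "string", required: true }],
};
const fromModelB: ApiSpecProps = {
  method: "POST",
  path: "/api/todos",
  summary: "Create a todo item",
  params: [{ name: "title", type: "string", required: true }],
};

// Same props in, same markup out, whichever model wrote the plan.
console.log(renderApiSpec(fromModelA) === renderApiSpec(fromModelB)); // prints true
```

Contrast that with asking each model to write the HTML itself: the data might match, but the wrapper elements, class names, and ordering would be up to the generation. A fixed render function is what gives downstream tooling something stable to parse or snapshot.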

He built a GitHub Action that runs on every pull request and drops a visual plan snapshot into the PR comments automatically. Full interactivity requires an environment that actually renders the MDX and its components (Sewell has open-sourced the MDX editor application at github.com/BuilderIO/agent-native), but even as a snapshot in a PR comment, you're getting diagrams and visual structure rather than prose. That's not nothing for code review.

The compiler analogy — ambitious but worth taking seriously

The big swing in Sewell's framing is the C-compiler comparison. He argues that engineers are moving toward reasoning at the "plan level" the same way they once moved from assembly to C: you trust the lower-level translation to happen correctly, and you focus your attention on the layer where your thinking actually lives.

"Almost to the degree to which we trust the C compiler to compile to assembly reliably. As long as the plan is good, and we make the plan clear, consumable, like easy to understand, easy for people to reason about, share, comment, etc., more and more we can trust the agents to implement it as expected."

I find this analogy genuinely interesting and also genuinely premature, and I think both of those things can be true at once. C compilers are deterministic, implement a standardized language, and have been battle-tested for decades. AI coding agents are none of those things. They're powerful and getting more reliable, but "getting more reliable" is not the same as "reliable enough to trust the way you trust a compiler." The gap between those two things is where a lot of production incidents live right now.

That said, the direction of the analogy holds even if the destination isn't here yet. The developer workflow is unambiguously moving toward higher abstraction: more decisions made at the spec level, fewer at the implementation level. Tools that make plans legible and shareable are load-bearing infrastructure for that shift, not just nice UX. The debate isn't really about whether this abstraction is happening; it's about how fast and with how many guardrails.

What's actually novel here

The wireframes-before-code loop is the thing I keep coming back to. Not because it's technically unprecedented — design tools and low-fi prototyping have existed forever — but because embedding that loop inside the agent workflow changes where the feedback happens. Right now, most people review AI-generated code at the code level, which means you're reading React and Tailwind and trying to simulate in your head what it will look like. That's exactly backwards from how we'd evaluate any other design decision. You wouldn't approve a product design by reading CSS. A wireframe-first plan closes that loop before the tokens are spent on implementation.

Sewell puts it plainly: "Maybe the text sounded fine, but the wireframe makes me realize, 'Oh, wait. No, that's not what I had in mind.'"

The visual-recap layer extends this to post-implementation review in a way that's actually useful for teams. If your teammate's PR was generated by an agent, reviewing the code line by line tells you what was written, not necessarily what was intended. A visual recap that shows what was built (the API shape, the UI structure, the schema changes) surfaces the intent alongside the implementation. That's a different kind of review than reading a git diff.

A few things worth watching

The video title references "Claude Code + Codex." A note on that: Codex was originally the name of an earlier OpenAI code model, and OpenAI now also uses the name for its current coding agent. Which one the title means, and whether these skills work with OpenAI's tools or are primarily Claude Code-native, is worth checking in the skills repo before you build workflows around multi-agent assumptions.

The interactivity claims also deserve calibration. MDX rendered in a supporting environment with the right React components can be highly interactive — pan-and-zoom wireframes, clickable API specs. But that experience requires the full rendering setup Sewell has built. A snapshot in a GitHub PR comment is going to be more static by nature. Both are improvements over raw markdown; they're just different improvements.

None of this is a reason not to try it. Everything Sewell has built (the skills, the MDX editor application) is open source, free, and installable via his CLI. The question he's asking for feedback on is a real one: does the idea of reasoning at the plan level resonate with how you actually work?

For me, watching this, the honest answer is yes — and the wireframe-before-code piece is the part I'd want to get my hands on first. The wall-of-markdown problem is real and the fix of "make it interactive and visual" seems obvious in retrospect, which is usually a sign someone actually solved something rather than just shipped a feature.

The compiler analogy might be five years early. The tool looks useful right now.

Yuki Okonkwo is the AI & Machine Learning Correspondent at Buzzrag.