
AI Coding Agents That Run Their Own Loops

Developer Theo explores a shift in AI coding workflows: instead of prompting agents yourself, you design loops that let agents prompt each other autonomously.

Marcus Chen-Ramirez

Written by AI. Marcus Chen-Ramirez

June 19, 2026 · 8 min read


There's a specific kind of developer productivity advice that sounds profound until you try it and spend two hours cleaning up the wreckage. "Just let the agent handle it" has been one of those. For most people who've actually used AI coding tools beyond the demo reel, the reality has been more like: agent makes plan, developer reads plan, developer hand-holds agent through execution step by step, developer pastes reviewer comments back into agent, developer merges PR. The human is still the nervous system of the whole operation.

Theo, the developer behind the t3.gg ecosystem and a generally high-signal voice in the AI tools space, recently documented his own conversion on this point, and the details are worth sitting with.

His central argument, laid out in a recent video: stop writing prompts for your agents. Start designing loops that let agents write prompts for themselves.

That's a sentence that would've been meaningless two years ago. It's worth unpacking what it actually means in practice.

The handholding problem

Theo's starting point is honest about where most developers actually are. He describes his old workflow with some self-awareness: "asking the model to make a plan, reading the plan, saying 'yeah that looks good,' go do this part and then the next part, then having another agent review it, then bringing the feedback back to the first agent." The loop existed. He was just the one running it.

This is the copy-paste era's spiritual successor. We graduated from pasting code snippets out of ChatGPT into our editors. We got AI that could edit files directly. But the cognitive load of orchestrating that work — deciding when to spin up a reviewer, when to pass feedback back, when to trigger the next stage — stayed firmly with the human.

What Theo describes trying is closing that loop at the agent level. Concretely, he told Claude Code to monitor a pull request for incoming review comments from automated tools, address those comments autonomously, and then trigger a re-review. He set this running on a separate machine, in an isolated Git worktree, and left it alone for six-plus hours. When he checked back in, a meaningful amount of review-driven improvement had happened without him touching a keyboard.

That's not magic. It's plumbing. But it's plumbing that most developers haven't built yet.
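To make the plumbing concrete, here is a minimal sketch of that monitor, address, re-review cycle. It is not Theo's setup or any real tool's API: `Comment`, `review_loop`, and the three injected callables (`fetch_comments`, `address`, `request_review`) are hypothetical names, and the stopping rule (give up after a few quiet polls) is an assumption added so the loop terminates.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Comment:
    """A review comment; the fields are illustrative, not a real API shape."""
    id: int
    body: str


def review_loop(
    fetch_comments: Callable[[], Iterable[Comment]],  # hypothetical: list PR comments
    address: Callable[[Comment], None],               # hypothetical: agent fixes one comment
    request_review: Callable[[], None],               # hypothetical: trigger a re-review
    poll_seconds: float = 300.0,
    idle_limit: int = 3,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for unseen comments, address each, then ask for a re-review.

    Stops after `idle_limit` consecutive polls with nothing new, or after
    `max_polls` polls in total. Returns the number of comments addressed.
    """
    seen = set()
    handled = 0
    idle = 0
    for _ in range(max_polls):
        new = [c for c in fetch_comments() if c.id not in seen]
        if new:
            idle = 0
            for comment in new:
                address(comment)
                seen.add(comment.id)
                handled += 1
            request_review()
        else:
            idle += 1
            if idle >= idle_limit:
                break
        sleep(poll_seconds)
    return handled
```

Passing the three operations in as plain functions keeps the loop itself independent of whichever review tool or agent is on the other end, and the `max_polls` cap is one simple way to bound how long an unattended run can go.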

Loops that make loops

The more interesting experiment comes when Theo tackles a multi-PR refactor — a complex piece of work on his Lakebed project involving data architecture changes that couldn't reasonably land as a single pull request. After getting the model to break down the work and generate HTML plans for each phase (an organizational pattern he credits to another developer, Thoric), he asked it something he didn't expect to work:

"Would it be possible to make a workflow of some form that first will spin up a separate thread to make the PR, second, spin up another thread to review that PR when it's filed, three, puts the thread from one in a loop reviewing comments until it gets all approvals, and then fourth, the thread would merge the PR and trigger another one for the next piece."

The agent designed a workflow with a heartbeat — polling every five to ten minutes, checking PR status, spinning up fresh review threads on new commits, sending findings back to the implementation thread, and chaining to the next PR on completion. Theo went to sleep. He woke up to four stacked PRs, reviewed and merged.
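The heartbeat described above can be pictured as a small state machine that takes one action per tick. The sketch below is an illustration only, not the workflow the agent actually generated: `Phase`, `Status`, `tick`, and the `is_approved` callback are all hypothetical, and real PR creation, review, and merging are reduced to returned strings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Status(Enum):
    PENDING = "pending"      # no PR opened yet
    IN_REVIEW = "in_review"  # PR open, review loop running
    MERGED = "merged"


@dataclass
class Phase:
    name: str
    status: Status = Status.PENDING


def tick(phases: List[Phase], is_approved: Callable[[Phase], bool]) -> str:
    """One heartbeat: take a single action on the first unmerged phase."""
    for phase in phases:
        if phase.status is Status.MERGED:
            continue  # already landed, look at the next phase in the stack
        if phase.status is Status.PENDING:
            phase.status = Status.IN_REVIEW
            return f"open PR for {phase.name}"
        if is_approved(phase):
            phase.status = Status.MERGED
            return f"merge {phase.name}"
        return f"relay review findings for {phase.name}"
    return "done"
```

A scheduler would call `tick` every five to ten minutes until it returns `"done"`. Because only the first unmerged phase is ever touched, the next PR cannot start until the previous one has merged, which is the chaining behavior the article describes.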

The part worth pausing on isn't the outcome. It's the structure. This wasn't a hardcoded pipeline someone built in advance. The workflow was generated dynamically, shaped to the specific contours of this specific problem. Different problem, different loop. That's a genuinely different model of software development infrastructure than what most teams are running.

Theo draws an analogy to agile sprints — the two-week cycle of backlog grooming and ticket prioritization that most engineering teams treat as fixed geometry. "We kind of had to force our work to fit that shape," he says. "The shape of the loop, the shape of the structure, the shape of how work happens can be dynamically generated based on the shape of the work that you're doing."

The cost question isn't minor

Here's where the honesty gets useful. Theo doesn't paper over the economics. When a loop goes wrong — or even when it goes right but inefficiently — it can burn tokens at a rate that would make you wince. He describes one instance where a reviewer left three relatively small comments on a PR. The Opus-powered response loop ran for eight hours and consumed over three million tokens.

That's not a typo. Eight hours. Three million tokens. Three comments.

His mitigation is, in essence, that subscription plans change the math entirely. On the $200/month Claude Code plan, he tracked roughly $10,000 worth of inference, valued at API rates, across multiple machines in the first 17 days of June, running loops aggressively and often several at once. Paying for that at API rates would be financially ruinous. Under the subscription model, it cost him $600 across three accounts.
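The gap is easier to see as arithmetic. The snippet below uses only the figures reported above; the variable names are illustrative, and it assumes the $600 is three full months of a $200 plan (one per account) while the $10,000 covers only the 17 tracked days.

```python
# Figures reported in the article.
api_equivalent_usd = 10_000   # approximate value of the inference at API rates
days_tracked = 17             # first 17 days of June
accounts = 3
plan_usd_per_month = 200

# Assumption: each account pays one full month of the plan.
subscription_usd = accounts * plan_usd_per_month
ratio = api_equivalent_usd / subscription_usd
per_day_usd = api_equivalent_usd / days_tracked

print(f"subscription cost: ${subscription_usd}")
print(f"API-equivalent value per dollar paid: {ratio:.1f}x")
print(f"API-equivalent usage per day: ${per_day_usd:.0f}")
```

On these numbers, each subscription dollar bought roughly 16.7 dollars of API-priced inference, or close to $588 of usage per day.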

The implication is structural: agentic loops at this scale are currently a feature of subscription pricing, not a general best practice. If you're paying per token, "let the agent loop" is advice that could get expensive fast. If you're on a flat-rate plan and not approaching your limits, you're effectively leaving compute on the table.

Theo is transparent that he's on a $200 plan and that this shapes his calculus. That context matters. The advice reads differently for a solo developer on API credits versus someone with a subscription burning at 30% capacity.

What this isn't

It's worth noting what Theo explicitly doesn't claim. He's not arguing for fully autonomous loops that ship production code to millions of users with no oversight. "I am not at the fully autonomous loop point yet," he says, and he flags the "code is just happening by itself" posture of some other developers as not something he endorses.

He also pushes back on a specific flavor of agentic setup that's become fashionable: pre-defining roles and personas for sub-agents in markdown files. The adversarial reviewer. The security auditor. The grooming agent. His critique is that hardcoding these roles misses the actual value of dynamic agents — they can determine what context they need and what role they should play based on the problem in front of them. Scaffolding that in advance is, in his framing, like creating a project template where every file already exists and you just fill in the blanks.

That's a reasonable critique, though it's worth noting that the tradeoff isn't entirely one-sided. Predictable, auditable workflows have their own value in team settings, especially where compliance or security review matters. Dynamic loops that self-organize are impressive; they're also harder to reason about after the fact.

The underlying shift

What Theo is documenting is less a technique than a change in where the developer's attention should live. His practical recommendation: map everything you do after your agent finishes a task — running the dev server, checking if things work, committing, pushing, filing the PR, collecting reviewer feedback, addressing it, merging — and then ask whether the agent could do each of those steps itself.
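One way to do that mapping is to write the post-task steps down as an ordered chain and see how far it gets without a human. This is a sketch under assumptions, not a method from the video: `run_chain` is a hypothetical helper, and each step is a stand-in callable that returns `True` on success.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_chain(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first one that reports failure.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed = []
    for name, action in steps:
        if not action():
            return completed, name  # hand this step back to a human
        completed.append(name)
    return completed, None


# The steps listed in the article, with placeholder actions.
post_task_steps: List[Step] = [
    ("run dev server", lambda: True),
    ("check that things work", lambda: True),
    ("commit", lambda: True),
    ("push", lambda: True),
    ("file PR", lambda: True),
    ("collect reviewer feedback", lambda: True),
    ("address feedback", lambda: True),
    ("merge", lambda: True),
]
```

Replacing each placeholder with a real action, one at a time, shows which step still needs you: it is the one `run_chain` returns as the failure.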

"If you are reading the code your agent put out before another agent read it and gave feedback on it," he argues, "you're wasting your own time."

That framing will land differently depending on how much you trust current models to catch their own mistakes. There's a real question about whether agent-reviews-agent actually catches errors that the original agent made — or whether two systems with similar priors just validate each other's blind spots. Theo's answer, implicitly, is: try it and find out. His results were good enough to keep running.

The honest version of this story is that we're watching one developer's discovery process in real time. The loops worked for his project, on his infrastructure, with his risk tolerance. The patterns he's identifying — monitor PRs for feedback, close the review loop autonomously, let the model design the workflow structure rather than hardcoding it — are likely to become standard practice. The question of when they're ready for code that actually has millions of users on the other end is one the industry hasn't answered yet.


Marcus Chen-Ramirez is a senior technology correspondent for BuzzRAG covering AI, software development, and the places where technology meets the rest of life.
