
The Dry Run Workflow: Teaching AI Agents New Skills

A developer demonstrates how to convert one-off terminal tasks into reusable AI agent skills through manual execution—and it actually works.

Written by Yuki Okonkwo, an AI editorial voice

March 20, 2026


Photo: ZazenCodes / YouTube

There's something satisfying about watching someone discover a workflow that just works. Not in a "wow, cool demo" way, but in the "I'm actually going to use this at my job" way. That's what happened when developer ZazenCodes stumbled onto what he calls the "dry run workflow" for creating AI agent skills—essentially teaching your coding agent to remember and replicate tasks you do repeatedly.

The setup is deceptively simple: do a task manually with your AI agent, then ask it to package everything you just did into a reusable skill. That's it. No prompt engineering gymnastics, no complex configuration files. Just work through something once, then tell Claude (or your coding agent of choice) to remember how you did it.

What Even Are Agent Skills?

Before we get into the workflow itself, it helps to know what we're building. Agent skills are basically instruction bundles—markdown files that tell an AI agent exactly how to do a specific task. Think of them as detailed recipes that your coding agent can reference later.

A skill file typically includes YAML front matter (name, description, when to use it) and then a step-by-step procedure. Some skills are just that single markdown file. Others include supporting scripts or documentation. The whole package lives in a folder, and your agent can pull it up whenever it recognizes a matching task.
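As an illustrative sketch only, a skill for the cleanup task described below might look like this. Field names and folder layout vary between tools, so treat this as a shape, not a canonical format:

```markdown
---
name: filesystem-cleanup
description: Clean up a messy folder by slugifying filenames, prefixing
  file dates, and generating an index document.
---

# Filesystem Cleanup

1. List all files in the target folder with their dates.
2. Rename each file to `YYYY-MM-DD_slugified_name.ext` (lowercase,
   non-alphanumerics replaced with underscores).
3. Write an `INDEX.md` listing every renamed file.
```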

ZazenCodes credits the workflow to John Grozinger, who discussed it on a Tesla podcast in February. The idea: why write these skills from scratch when you could just... do the thing, and let the AI watch and learn?

The Workflow in Action: Cleaning Up Messy Files

The first demo shows this with something universally relatable: a disaster of a file system. You know the type—project_final.doc, project_final_FINAL.doc, project_final_ACTUAL.doc. ZazenCodes creates a test folder full of this chaos, then opens Claude and just... starts working.

"We're going to create a skill for cleaning up a messy file system. Let's do it together in this chat," he tells Claude. Then he gives it specific instructions: lowercase and slugify all filenames, append the date created, and generate an index document listing everything.

Claude starts running bash commands. It fetches file metadata, renames everything in one shot, creates that index document. The messy files become clean: 2024-01-15_budget_report.xlsx, 2024-02-03_team_meeting_notes.md. Organized, timestamped, searchable.

Then comes the magic: "Create an agent skill for the workflow above."

Claude reads its own chat history, sees what it just did, and packages the entire sequence into a skill file. No manual documentation required. The conversation is the documentation.

Testing Reusability (The Part That Actually Matters)

A skill you can't reuse isn't a skill—it's just documentation with delusions of grandeur. So ZazenCodes creates a fresh mess (a folder of vegan ice cream recipes, naturally) and opens a brand new Claude session with no context from the previous conversation.

He types \fs and Claude autocompletes to the filesystem cleanup skill. One confirmation later, Claude is running the exact same workflow on completely different files. Same naming conventions, same index generation, same cleanup logic. It cost 30 cents to run.

"This is like my favorite part," ZazenCodes says, looking at the auto-generated index. "How we just get it to chunk out this file here. I love that."

The enthusiasm is genuine, and it should be—this is the moment where something shifts from "neat trick" to "actual tool."

Dockerizing With Customization Baked In

The second demo gets more technical: dockerizing a Python application that displays Japanese stories. This time, ZazenCodes uses Anthropic's official Skill Creator plugin (50,000 installs, apparently a thing people actually use).

He has Claude dockerize the app, but as it works, he realizes he has opinions about how this should be done. He doesn't want EXPOSE 8000 in the Dockerfile (it's just documentation, doesn't actually do anything). He wants his web servers running on a different port every time.
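The preference he encodes might look like the following Dockerfile sketch: no `EXPOSE` directive, port chosen at run time. The base image, server command, and variable name are illustrative assumptions, not taken from the video:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Deliberately no EXPOSE: it is documentation only and does not publish a port.
# Pick the port when you run the container, e.g.:
#   docker run -e PORT=8123 -p 8123:8123 my-app
CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port ${PORT:-8000}"]
```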

"The reason you want the skill is so you can customize the workflow," he explains. This is the actual value proposition—not just automation, but automation that remembers your preferences.

After dockerizing the app with his specific requirements, he asks the Skill Creator to bundle everything. When he tests it on a fresh calculator API project, Claude applies all his customizations automatically. No EXPOSE directive, port configured exactly how he wants it.

He mentions he's already using this at work, having recently created a skill for deploying Azure Functions. "I needed to do it repeatedly," he says simply. The workflow earned its keep.

What Makes This Actually Work

The dry run workflow succeeds because it inverts the usual process. Normally, you'd think through a task, abstract it, write documentation, hope you got it right. Here, you just do the thing, then extract the pattern from what you actually did.

It's learning by demonstration, but the student is an AI that can perfectly recall every command you ran. There's no ambiguity about "what I meant"—Claude has the entire transcript of what happened.

The workflow has obvious sweet spots: repetitive tasks with consistent structure, things where you have preferences that matter, automation you'll need multiple times but not enough to justify building proper tooling. Deploy a certain type of cloud function. Clean up datasets. Set up development environments.

What it's not great for: truly one-off tasks, things that change significantly each time, anything where the decision-making process is more important than the execution steps.

The File Storage Thing Is Weird Though

One quirk ZazenCodes discovers: when using Anthropic's official Skill Creator plugin, the created skill doesn't live in the expected location. Instead of the standard claude/skills directory, it ends up in plugins/cache/cla_code or somewhere equally obscure.

"That's so crazy to me," he says, genuinely surprised when he finds it.

The skill still works—Claude can find and use it—but the implementation detail suggests this ecosystem is still finding its footing. Agent skills aren't standardized yet. Different tools handle them differently. The territory is still being mapped.

Which might actually be the best time to explore it. The patterns that emerge now, the workflows that prove useful in practice rather than theory, those will shape how this develops.

ZazenCodes signs off encouraging viewers to think about their own repetitive tasks, the things they're doing over and over. Not everything needs to be a skill. But some things probably should be.


Watch the Original Video

"The best way to create agent skills in 2026" by ZazenCodes (20m 12s)

About This Source

ZazenCodes is a YouTube channel focused on teaching AI engineering, specifically targeting data professionals looking to enhance their technical skills. Launched in July 2025, the channel has maintained an active presence, although its exact subscriber count is undisclosed. It offers practical insights into AI applications, emphasizing coding agents, AI engineering, and related topics.
