Claude Code Routines: AI That Audits Your Code

Anthropic just released a feature that lets Claude Code run security audits and make code improvements while your laptop is closed. It's called Claude Code Routines, and according to developer Leon van Zyl's demonstration, it found 75 security vulnerabilities in a simple to-do app—most of which the developer didn't know existed.

I've watched enough AI coding demos to know the difference between a carefully staged success and something potentially useful. This one sits somewhere in the middle, which makes it more interesting than either extreme.

What Actually Happens

Claude Code Routines are essentially saved prompts attached to a GitHub repository that run whenever a trigger fires. You configure three things: a prompt (what you want Claude to do), a repo (where it should do it), and a trigger (when it should happen). The trigger can be a cron-style schedule, an API call, or a GitHub event such as a merged pull request.
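As a rough mental model, those three pieces can be written down as a small data structure. This is only a sketch in Python: the Routine, Trigger, and TriggerKind names, the field names, and the example repository are all hypothetical, and none of it is Anthropic's actual configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class TriggerKind(Enum):
    SCHEDULE = "schedule"          # cron-style timer
    API_CALL = "api_call"          # started by an HTTP request
    GITHUB_EVENT = "github_event"  # e.g. a merged pull request


@dataclass(frozen=True)
class Trigger:
    kind: TriggerKind
    detail: str  # cron expression or event name; free-form in this sketch


@dataclass(frozen=True)
class Routine:
    prompt: str       # what Claude should do
    repo: str         # where it should do it
    trigger: Trigger  # when it should happen


# Hypothetical example modeled on the hourly "auto improver" from the demo.
auto_improver = Routine(
    prompt="Explore the code base and identify one meaningful improvement.",
    repo="example-user/todo-app",  # placeholder, not the demo's real repo
    trigger=Trigger(TriggerKind.SCHEDULE, "0 * * * *"),  # minute 0 of every hour
)
```

The point of the sketch is how little there is to configure: one string of instructions, one repository, one trigger.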

The routines run in Anthropic's cloud, not on your machine. Close your laptop, go to bed, wake up to pull requests. The mechanics are straightforward—almost boring, which is usually a good sign in enterprise software.

Van Zyl demonstrated two routines. The first, an "auto improver," was instructed to "explore the code base and identify one meaningful improvement" every hour. The second ran OWASP Top 10 security audits on a schedule.

The Auto-Improvement Routine

The auto-improver found that van Zyl's to-do app lacked an edit function. You could create cards and delete them, but clicking on a card did nothing. To fix a typo, you'd have to delete and recreate the entire card.

Claude identified this gap, implemented the fix, and created a pull request—all automatically. Van Zyl merged it, refreshed the app, and the edit function worked. "I didn't even notice that," he said when Claude pointed out the missing feature.

This is either impressive or concerning, depending on your perspective. On one hand: free feature development while you sleep. On the other: do you really want an AI deciding what features your app needs?

The answer probably depends on your relationship with technical debt. If you're shipping fast and accumulating "we should fix that someday" issues, an AI that actually fixes them might be valuable. If you're carefully managing a product roadmap, autonomous feature addition sounds like chaos.

The Security Audit Found More Than Expected

The security routine is where things get interesting. Van Zyl intentionally hardcoded API keys in a route handler as a test case. He expected Claude to find that one obvious vulnerability.

Instead, the audit flagged 75 vulnerabilities in this "simple" application. The hardcoded keys were there, yes. But so were an authorization header that was read but never validated, missing CORS policies, and a potential SQL injection vulnerability in the analytics route.
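To make two of those findings concrete, here is a small illustration in Python. It is not van Zyl's code, which I haven't seen beyond the video; the table, function names, and token are invented for the example. It puts a query built by string interpolation next to a parameterized one, and shows an authorization header that is actually compared against an expected value instead of merely being read.

```python
import hmac
import sqlite3


def count_events_unsafe(conn, user_id: str) -> int:
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT COUNT(*) FROM events WHERE user_id = '{user_id}'"
    return conn.execute(query).fetchone()[0]


def count_events_safe(conn, user_id: str) -> int:
    # Safer: the driver binds the value, so it is never parsed as SQL.
    query = "SELECT COUNT(*) FROM events WHERE user_id = ?"
    return conn.execute(query, (user_id,)).fetchone()[0]


def is_authorized(headers: dict, expected_token: str) -> bool:
    # Validate the header rather than just reading it.
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {expected_token}"
    # Constant-time comparison, to reduce timing side channels.
    return hmac.compare_digest(supplied.encode(), expected.encode())


# Tiny in-memory database standing in for an analytics table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT)")
conn.executemany("INSERT INTO events VALUES (?)", [("alice",), ("bob",), ("bob",)])

injected = "nobody' OR '1'='1"
print(count_events_unsafe(conn, injected))  # the injected OR clause matches every row
print(count_events_safe(conn, injected))    # treated as a literal user id, matches none
```

In a real app the expected token would come from an environment variable or secret store, not from source code, which is the hardcoded-key problem all over again.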

"We thought the only issues were these API keys, right?" van Zyl said, scrolling through the audit report. "How scary is that?"

The routine is based on the OWASP Top 10—the industry-standard list of critical web application security risks. Van Zyl created a custom "skill" (Claude's term for reusable knowledge modules) that embeds all the OWASP documentation, prevention strategies, and example scenarios. Claude references this skill during the audit.

Here's what makes this genuinely useful rather than just clever: the audit generates a dated report in your repository, creates a pull request with fixes, and—if you configure it this way—automatically merges the changes. You can review first if you prefer, but van Zyl's logic for auto-merging makes sense: "If this picks up a security issue, I would want to get that fixed instead of there being a risk of this vulnerability being in the app for the next couple of hours and I get hacked."

The Questions This Raises

Watching the demo, I kept thinking about the 1990s promise of automated code generation. Remember CASE tools? They were going to let business analysts generate entire applications without programmers. Didn't work out that way.

But this isn't trying to replace programmers. It's automating code review and basic security hygiene—tasks that experienced developers know they should do more often but realistically don't. There's a difference between "AI will write your app" and "AI will catch the stupid mistake you made at 11 PM on a Friday."

The daily run limits suggest Anthropic is still figuring out the economics. Pro users get 5 runs per day, Max gets 15, and Team and Enterprise get 25. An hourly routine alone needs 24 runs a day, so only the top tier can sustain even one, and no tier can run one every hour across multiple projects. The pricing pressure is visible.
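The arithmetic behind those limits is easy to check. The sketch below assumes each execution of a routine counts as exactly one run against the daily limit, which is my reading of the numbers and not something the demo spells out; the function names are mine.

```python
RUNS_PER_DAY = {"Pro": 5, "Max": 15, "Team/Enterprise": 25}  # limits quoted above
HOURLY_RUNS = 24  # one run per hour, per routine


def hourly_routines_supported(daily_limit: int) -> int:
    """How many once-an-hour routines fit inside a daily run limit."""
    return daily_limit // HOURLY_RUNS


def min_interval_hours(daily_limit: int, routines: int = 1) -> float:
    """Shortest even spacing, in hours, that keeps `routines` within the limit."""
    return 24 * routines / daily_limit


for plan, limit in RUNS_PER_DAY.items():
    print(plan, hourly_routines_supported(limit), round(min_interval_hours(limit), 2))
```

Under that assumption, Pro and Max cannot fit a single hourly routine, and the 25-run tiers fit exactly one. A Pro user who wants a routine spaced evenly through the day is looking at roughly one run every five hours.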

There's also the trust question. Van Zyl configured his security routine to auto-merge fixes. Would you? The answer probably correlates with your risk tolerance and your test coverage. If you've got comprehensive tests, auto-merging security fixes might be fine. If you don't, you're essentially letting an AI merge changes into your main branch, and possibly ship them to production, based on its own judgment.
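That trade-off can be stated as a rule of thumb in a few lines of Python. Everything here is hypothetical: the PullRequestStatus type, the should_auto_merge function, and the 80 percent coverage threshold are illustrative choices of mine, not settings Claude Code Routines exposes as far as the demo shows.

```python
from dataclasses import dataclass


@dataclass
class PullRequestStatus:
    tests_passed: bool
    line_coverage: float  # fraction of lines exercised by tests, 0.0 to 1.0


def should_auto_merge(pr: PullRequestStatus, min_coverage: float = 0.8) -> bool:
    """Auto-merge only when a passing test suite covers enough code to be trusted."""
    return pr.tests_passed and pr.line_coverage >= min_coverage


# A green build on a thinly tested codebase still goes to a human.
print(should_auto_merge(PullRequestStatus(tests_passed=True, line_coverage=0.35)))
```

A passing build means little if the tests barely touch the code, which is why the sketch checks both conditions before letting a fix through unreviewed.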

The Part Nobody Wants to Talk About

The elephant in the room: using this means handing your private codebase to Anthropic's cloud, and the demo never says whether that code could end up training Claude. The terms of service presumably address this, but I noticed van Zyl used a public demo repository for his tutorial. Would he connect his production code? The video doesn't say.

This matters more than the feature itself. Every AI coding tool faces the same tension—you want the AI to understand your specific codebase, but you're uneasy about sending your company's intellectual property to a third-party cloud service. Anthropic's enterprise customers presumably have contractual protections. Individual developers might want to read the fine print.

What This Actually Means

Claude Code Routines won't replace code review or security audits conducted by humans who understand your business context. But for the security issues that are objectively wrong (hardcoded credentials, unvalidated inputs, missing security headers), automated detection and fixing start to make sense.

The auto-improvement routine is harder to evaluate. Sometimes you want an AI to notice the edit function you forgot to implement. Sometimes you want full control over your feature roadmap. The answer depends on your project, your team, and your tolerance for surprises.

What's clear is that the line between "AI assistant" and "AI colleague" keeps moving. We went from autocomplete to chat-based coding help to agents that commit code while you sleep. Each step seems reasonable in isolation, but the cumulative effect is that we're ceding more decisions to systems we don't fully understand.

Maybe that's fine. Or maybe we're just getting comfortable with it before we've really thought it through.

—Mike Sullivan, Technology Correspondent