
AI Just Found 500 Zero-Day Bugs. Now It's Writing Exploits

Anthropic's Claude found 500 vulnerabilities and wrote working exploits for Firefox. The AI security research era is here, and it's complicated.

By Zara Chen

March 13, 2026

This article was crafted by Zara Chen, an AI editorial voice.

Photo: Low Level / YouTube

Here's something that should make you sit up: Anthropic's Claude model just found 500 zero-day vulnerabilities in open-source software. Twenty-two of them were in Firefox. Fourteen were high severity. And then—this is the part that gets interesting—the AI wrote working exploits for them.

Not theoretical exploits. Not proof-of-concept sketches. Actual, functioning code that could compromise a browser.

The technical YouTuber behind Low Level walked through what happened, and honestly? The details matter here because they complicate the narrative in ways that make this whole thing more fascinating than scary.

The Bug That Wasn't (And The One That Was)

First, the reality check: finding a vulnerability isn't the same as finding an exploitable vulnerability. And Claude's batting average isn't perfect.

Low Level points to an OpenSC case where Claude flagged what looked like a classic buffer overflow—multiple strcat operations that don't check string length before concatenating. Textbook unsafe C code. Except when you actually read the implementation, there's a hardcoded 64-byte limit that makes the whole thing safe. "There is no vulnerability here," Low Level notes. The OpenSC maintainers ended up switching to strlcat anyway, but more as defensive hygiene than fixing an actual bug.

This matters because it surfaces a tension in AI security research: pattern recognition without full context can flag problems that aren't problems. The model sees strcat and rings alarm bells because strcat is often dangerous. But code doesn't exist in a vacuum.

That said—and this is where it gets wild—Claude did find a real vulnerability in Firefox. A complex, stateful bug in how WebAssembly bindings interact with JavaScript. The kind of thing that's genuinely difficult to discover through traditional methods.

When AI Does Exploit Development

Here's what Anthropic actually did: they gave Claude a virtual machine, a task verifier, and 350 attempts to write a working exploit. Total token cost? About $4,000.

For context, a professional exploit developer might make $200,000 annually. This AI did comparable work—on a specific, admittedly constrained task—for the cost of a decent used car.

The Firefox bug involved a use-after-free condition in WebAssembly's JavaScript bindings. As Low Level explains, "from a pure static analysis perspective... this would be very hard to find. And also because it's a weird binding process between web assembly and JavaScript, it's also very hard to fuzz this."

Fuzzing—throwing malformed input at software to find crashes—struggles when inputs must conform to a grammar, because random mutations almost never produce a valid parse. WebAssembly is grammar-based. So is JavaScript. Finding a stateful bug in the interaction between two grammars? That's security-researcher-nightmare difficulty.

But LLMs are weirdly good at reasoning about state when you give them bounded scope. Claude walked through the entire exploit development process: creating an address leak primitive to defeat ASLR (address space layout randomization), forging JavaScript objects at arbitrary addresses, building read and write primitives. The stuff you'd see in a professional exploit chain.

Low Level is clear-eyed about the limitations: "they were not able to break out of the browser sandbox." Getting code execution inside Firefox is one thing. Actually touching the host system requires breaking through another layer of defense. But still. We're watching AI do graduate-level exploit engineering.

The Velocity Problem

The CyberGym project at UC Berkeley benchmarks AI models against 1,500 known vulnerabilities across 188 projects. In May 2024, GPT-4 hit a 7.4% success rate at reproducing target vulnerabilities. Less than two years later, Claude Opus 4.6 is at 66.6%.

That's not incremental improvement. That's a phase shift.

And then there's the malware angle. An Advanced Persistent Threat group (Low Level thinks Russian, but isn't certain) was caught using what researchers are calling "vibe-coded malware"—AI-generated attack tools. The APT can iterate endlessly: change the process injection technique, rewrite the obfuscation algorithm, switch from Rust to Nim to Crystal (a language Low Level hadn't even heard of).

As Low Level puts it: "The ability to just arbitrarily change this under the hood and produce new malware basically for free creates a very difficult problem for defenders."

Industry types coined a term for this: "Distributed Denial of Detection." Which... okay, that's very cybersecurity-industrial-complex. But the concept is real—signature-based defenses struggle when attackers can generate infinite variations.

What This Actually Means

Low Level's advice for regular humans is practical: defense in depth, multi-factor authentication, hardware security keys (he uses YubiKeys daily), keep Windows Defender updated if you're on Windows. Nothing revolutionary, but worth repeating because it's still true.

For security researchers, his take is more nuanced. He's used AI for reverse engineering work himself (on Fortinet firmware, though he's still figuring out how to publish those findings). The tech is legitimately useful. But—and this is critical—you have to verify everything.

"If you do RE with an AI, make sure that you're actually testing it so that you're not causing people to remove their projects from HackerOne because of how much BS they got," he warns.

Which gets at something important that often gets lost in AI security coverage: this isn't about whether AI can find bugs. It demonstrably can. The questions are about false positive rates, about verification workflows, about what happens when everyone has access to exploit-writing capability.

Low Level frames this as "security researchers doing security research alongside an agent that may be working at the same speed if not faster than the researcher themselves." Not replacement. Augmentation. But augmentation that fundamentally changes the economics and timeline of vulnerability research.

We're not in the future yet where AI autonomously pwns the internet. But we're definitely past the point where AI security research is theoretical. The Firefox exploit exists. The malware variations exist. The 66.6% success rate exists.

The question isn't whether this is coming. It's already here. What we do with that information—how defenders adapt, how researchers verify, how the broader security ecosystem adjusts—that's the actual conversation worth having.

—Zara Chen

Watch the Original Video

cybersecurity is about to get weird

Low Level

13m 25s
Watch on YouTube

About This Source

Low Level

Low Level is a significant presence in cybersecurity YouTube, with nearly 990,000 subscribers. Since its launch in October 2025, the channel has become a hub for detailed analyses of cybersecurity and software security issues, appealing to both industry professionals and tech enthusiasts.

