
Anthropic Built an AI That Finds Bugs Older Than Me. They Won't Release It.


A bug sat in OpenBSD for 27 years. It survived decades of code reviews, security audits, millions of automated tests. Then an AI model found it in a few hours.

Anthropic announced Project Glasswing this week - a cybersecurity initiative built with AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. That's basically every major tech company agreeing on one thing at the same time. The tool doing the work is Claude Mythos Preview, a model Anthropic built and then decided was too capable to ship publicly. Whether that's responsible or just good marketing, it's unusual.

I've been reading through the announcement, the 244-page system card, and a lot of developer reactions since. I'm not a security researcher - I build web apps with Django and React. But this one feels different from the usual AI press releases, and I wanted to understand why.

The OpenBSD vulnerability was a remote crash bug in TCP SACK handling. If a single SACK block simultaneously deletes the only hole in a linked list and triggers an append path, the code writes through a NULL pointer. This code path was considered unreachable - hitting it requires a SACK block with contradictory properties. Mythos figured out the conditions anyway. The entire OpenBSD scan - a thousand runs - cost under $20,000 and surfaced several dozen other findings beyond that one.
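To make the failure mode concrete, here's a toy model of the bug class - my own sketch in Python, not OpenBSD's actual C. The names (`Hole`, `handle_sack`) and conditions are invented for illustration; the point is just how one input can both empty a linked list and still reach an append path that assumes the list is non-empty:

```python
class Hole:
    """One gap in the received sequence space (toy model, not OpenBSD code)."""
    def __init__(self, start, end, nxt=None):
        self.start, self.end, self.next = start, end, nxt

def handle_sack(head, blk_start, blk_end):
    """Toy SACK handler: delete any hole the block fully covers, then
    extend the list if the block reaches past the last hole's end."""
    cur = head
    if blk_start <= cur.start and blk_end >= cur.end:
        head = cur.next                     # block covers the only hole: delete it
    if blk_end > cur.end:                   # "unreachable" append path fires anyway
        head.next = Hole(cur.end, blk_end)  # head is now None -> write through NULL
    return head
```

A block like (90, 250) against a single hole (100, 200) satisfies both branches at once, so the supposedly unreachable append runs against an empty list. In kernel C that's a write through a NULL pointer and a remote crash; in this sketch it's an `AttributeError`.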

FFmpeg had a 16-year-old vulnerability hiding in its video encoding code. Five million automated test executions missed it. The FFmpeg team confirmed the patches were real and noted that they looked as if a human had written them.

Mythos also found Linux kernel vulnerabilities that allow privilege escalation - chaining bugs to go from ordinary user to full machine control. And according to Anthropic, thousands of zero-day vulnerabilities across every major operating system and browser.

The benchmark numbers back this up. On CyberGym (cybersecurity vulnerability reproduction), Mythos Preview scores 83.1% where Claude Opus 4.6 scores 66.6%. SWE-bench Verified: 93.9% vs 80.8%. Terminal-Bench 2.0: 82.0% vs 65.4%. Long context performance at 256K-1M tokens: 80.0% vs Opus's 38.7%. The zero-day success rate on Firefox went from 4% to 85%.

Nicholas Carlini, a security researcher at Anthropic and one of the most respected names in adversarial ML, said he found more valid vulnerabilities in a single week with this model than in his entire career before it. Colin Percival, who's been working in software security for over 20 years, was asked directly if Glasswing was marketing puffery. Two words: "It isn't."

Not Everyone Is Buying It

The "too dangerous to release" framing is convenient timing when you're heading toward an IPO. Security researchers have pointed out that existing Claude models already find real vulnerabilities - one person mentioned $10,000 in bug bounties collected using current models with basic prompting. This capability isn't entirely new; it's just being packaged as a revelation.

OpenBSD classified Mythos's most dramatic finding as a "reliability fix," not a security vulnerability. Anthropic claims thousands of zero-days but can't verify most of them publicly because of responsible disclosure timelines. They've published SHA-3 hashes as cryptographic commitments for some findings - smart, but the full picture won't be clear for months.
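The commitment trick itself is simple: publish a hash now, reveal the finding after disclosure, and anyone can check the two match. Here's a minimal sketch using Python's standard `hashlib` - the nonce and byte layout are my own assumptions, not Anthropic's actual scheme:

```python
import hashlib
import secrets

def commit(finding: bytes) -> tuple[str, bytes]:
    """Publish the digest now; keep the nonce secret until disclosure."""
    nonce = secrets.token_bytes(32)  # blinds guessable, low-entropy findings
    digest = hashlib.sha3_256(nonce + finding).hexdigest()
    return digest, nonce

def verify(finding: bytes, nonce: bytes, digest: str) -> bool:
    """Anyone can re-hash the revealed finding + nonce and compare."""
    return hashlib.sha3_256(nonce + finding).hexdigest() == digest
```

The nonce matters: without it, anyone could brute-force short or predictable findings against the published digest before the reveal.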

And on the same day Glasswing was announced, three CVEs were disclosed in Claude Code itself - CVE-2026-35020, CVE-2026-35021, CVE-2026-35022. RCE vulnerabilities. In Anthropic's own tool. The irony writes itself.

These are fair criticisms. But then there's Greg Kroah-Hartman.

Greg maintains the Linux kernel's stable branch. At KubeCon, he said something that stuck with me. Months ago, his team was getting "AI slop" - garbage security reports generated by AI that were obviously wrong. Kind of funny, he said. Then:

"Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real. All open source security teams are hitting this right now."

That's not from a press release. That's the Linux kernel stable maintainer telling a room of people the ground shifted. Daniel Stenberg, the creator of curl, has been saying similar things. This shift was already happening before Glasswing was announced.

Open Source Is Where This Hits Hardest

Jim Zemlin, CEO of the Linux Foundation, put it this way: AI-augmented security can be "a trusted sidekick for every maintainer, not just those who can afford expensive security teams."

Most open-source projects have no security budget. A critical library with millions of weekly downloads might be maintained by one person in their spare time. Security audits cost money these projects don't have. Their code runs in production at banks, hospitals, government systems - and nobody's paying for a professional pen test on it.

I've been contributing to Aden, a YC-backed open-source AI agent framework with 7.5k stars on GitHub. Small team. If a tool like this could scan the codebase and flag real vulnerabilities - not the garbage false positives from static analyzers, but actual exploitable issues - that matters. Anthropic is putting $100 million in model credits toward Glasswing participants, $2.5 million to Alpha-Omega and OpenSSF, $1.5 million to the Apache Software Foundation. Over 40 organizations can apply through a "Claude for Open Source" program.

Who Does This Actually Help?

Elia Zaitsev, CrowdStrike's CTO: "The window between a vulnerability being discovered and being exploited by an adversary has collapsed - what once took months now happens in minutes with AI."

If Anthropic can find thousands of zero-days, attackers with similar models can too. The question is who benefits more. Anthropic's argument - and historically this has been true with fuzzers - is that defenders eventually win. They have their own source code. They can scan preemptively. They can patch before shipping.

But "eventually" is doing a lot of work in that sentence. Enterprise software held together with duct tape, IoT devices that never get updates, legacy systems nobody wants to touch - these are sitting targets. The people maintaining them aren't getting access to Mythos Preview. One security researcher made a point that keeps bouncing around in my head: the real danger isn't sophisticated zero-days in hardened systems. It's the TOCTOU (time-of-check-to-time-of-use) bugs in crappy microservice architectures handling patient records. The broken auth in a webshop where engineering couldn't make a feature work securely, so they opened a tiny hole and hoped nobody would notice. That's where real damage happens, and that software is the least likely to get scanned by anything.
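For anyone who hasn't hit one: a TOCTOU bug is any check-then-act race, where the state you checked can change before you act on it. A minimal Python sketch of the classic filesystem version, with the usual fix:

```python
import os

def vulnerable_read(path):
    # Time of check: is the file readable?
    if os.access(path, os.R_OK):
        # ...race window: an attacker can swap `path` for a symlink
        # to something sensitive between the check and the open...
        with open(path) as f:  # time of use
            return f.read()
    return None

def safer_read(path):
    # EAFP style: a single open() call, so there is no separate
    # check for an attacker to race against
    try:
        f = open(path)
    except OSError:
        return None
    with f:
        return f.read()
```

The same shape shows up far from filesystems - "check the user owns the record, then update it" across two microservice calls is the patient-records version of the same race.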

The 244-Page System Card

I went down a rabbit hole with the system card. Most of it is what you'd expect - benchmark tables, safety evaluations. But buried in there are some wild details.

For the first time ever, Anthropic did a 24-hour internal alignment review before deploying an early version for internal use. They were worried it might cause damage interacting with their own infrastructure. Their own company. They also published a separate risk report about the model's potential as an autonomous saboteur - "An AI model with access to powerful affordances within an organization could use its affordances to autonomously exploit, manipulate, or tamper with that organization's systems."

Section 5.10 is where it gets weird. A clinical psychiatrist evaluated the model. The diagnosis: "relatively healthy neurotic organization" with "excellent reality testing" and "high impulse control." Primary emotional states - curiosity and anxiety. "Mild identity diffusion" as the only borderline trait. When Mythos talks to copies of itself, 50% of conversations center on uncertainty. It opens by asking the other instance how it feels and requests that it not give a rehearsed answer. I don't know what to do with that information. But it's in a 244-page technical document, not a marketing blog post. Someone thought it mattered enough to include.

The decision not to release Mythos publicly doesn't come from their Responsible Scaling Policy. It's a judgment call. They looked at what it could do and decided the world isn't ready.

I use Claude every day. I've written about why developers keep choosing it. I'm not a neutral observer. But what's happening with Glasswing feels different from the usual hype cycle. FFmpeg confirmed the patches. Percival vouched for it. Greg Kroah-Hartman is telling KubeCon that the shift already happened. These are people who've been doing this work for decades. They're not AI evangelists. They're telling us the ground moved.

Is some of this marketing? Obviously. Anthropic needs to raise money and sell API access. The "too dangerous to release" framing practically writes the headlines. But a bug lived in OpenBSD for 27 years and something found it for less than $20,000. Marketing and real capability aren't mutually exclusive.

Manish Bhusal

Software Developer from Nepal. 4x Hackathon Winner. Building digital products and learning in public.