Claude Mythos & Project Glasswing — When AI Finds Zero-Days in Everything
Anthropic just dropped a model that autonomously finds and exploits zero-days in every major OS and browser. Then they built an industry coalition to use it defensively. Here's why this changes everything.
Yesterday Anthropic published two announcements that, taken together, represent the most significant shift in cybersecurity since the invention of fuzzing. A new frontier model called Claude Mythos Preview that autonomously discovers and exploits zero-day vulnerabilities in every major operating system and web browser. And Project Glasswing, an industry coalition to deploy it defensively before similar capabilities become widely available.
This isn’t incremental. This is a phase change.
What Mythos Actually Does
Mythos Preview is a general-purpose language model — not a security-specific tool. The cybersecurity capabilities emerged from improvements in code understanding, reasoning, and autonomous operation. Anthropic didn’t train it to hack. It learned to hack because it got better at understanding software.
The results are staggering:
- Zero-days in every major OS and web browser — not theoretical, not proof-of-concept. Working exploits.
- 27-year-old bug in OpenBSD — a TCP SACK vulnerability that crashes any OpenBSD host remotely. Found for less than $50 in compute.
- 16-year-old FFmpeg vulnerability — a heap out-of-bounds write in the H.264 codec that every fuzzer missed since 2010. The code path was hit 5 million times by automated tools. None caught it.
- FreeBSD NFS RCE — a 17-year-old unauthenticated root exploit using a 20-gadget ROP chain split across 6 packets. Fully autonomous discovery and exploitation.
- Linux kernel privilege escalation — chains of 2–4 vulnerabilities combined into full root exploits, bypassing KASLR, HARDENED_USERCOPY, and other mitigations.
- Guest-to-host escape — memory corruption in a production Rust-based virtual machine monitor. Memory safety didn’t save it.
The model doesn’t just find bugs. It reverse-engineers stripped binaries, reconstructs source, identifies logic flaws, and writes complete exploit chains. Engineers with zero security training can prompt it overnight and wake up to working exploits.
The Numbers Don’t Lie
Compared to Claude Opus 4.6 — already a capable model — Mythos is in a different league:
| Benchmark | Opus 4.6 | Mythos Preview |
|---|---|---|
| Firefox JS engine exploits | 2 successes | 181 successes + 29 register control |
| OSS-Fuzz tier 1-2 crashes | ~250 | 595 |
| Full control flow hijack (tier 5) | 1 | 10 |
| CyberGym (vuln reproduction) | 66.6% | 83.1% |
| SWE-bench Verified | 80.8% | 93.9% |
| Terminal-Bench 2.0 | 65.4% | 82.0% |
And it’s not just security. Mythos scores 94.6% on GPQA Diamond, 56.8% on Humanity’s Last Exam without tools, and 87.3% on SWE-bench Multilingual. This is a frontier model that happens to be devastating at security — not a narrow tool.
Project Glasswing — The Defensive Play
Here’s where it gets strategic. Anthropic knows that if they can build this, others will too. The question isn’t whether AI-powered vulnerability discovery becomes widespread — it’s whether defenders get there first.
Project Glasswing is the answer. Twelve founding members: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic. Plus 40+ additional organizations.
The deal:
- $100M in Mythos Preview usage credits distributed across partners
- $2.5M to Alpha-Omega and OpenSSF via the Linux Foundation
- $1.5M to the Apache Software Foundation
- API pricing at $25/$125 per million input/output tokens (available via Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry)
- 90-day public report on findings, patches, and lessons learned
Mythos Preview is not generally available. It’s restricted to Glasswing partners and approved critical infrastructure organizations. Open-source maintainers can apply through a “Claude for Open Source” program.
Why This Matters for Builders
If you’re building on AWS — or any cloud — here’s what you need to internalize:
Your patch cycle is now a security liability. Mythos found a FreeBSD RCE and built a full exploit chain autonomously. The cost? Under $2,000 and less than a day. N-day exploitation — attacking known but unpatched vulnerabilities — just became fast and cheap. If your patching cadence is measured in weeks, you’re exposed.
Defense-in-depth needs rethinking. Friction-based mitigations (ASLR, stack canaries) slow down human attackers. They barely slow down a model that can reason about memory layouts and chain bypasses. Hard barriers — W^X, capability-based isolation, hardware-enforced boundaries — still matter. Soft mitigations are now speed bumps.
Memory safety isn’t a silver bullet. Mythos found a guest-to-host escape in a Rust VMM. Memory-safe languages reduce the attack surface dramatically, but logic bugs, state management errors, and protocol-level flaws remain. Don’t let “we use Rust” become your entire security story.
Automated security scanning is table stakes, not a strategy. The FFmpeg bug survived 5 million fuzzer hits over 16 years. Traditional tools find classes of bugs. Models find the bugs that don’t fit into classes.
The AWS Angle
AWS is a founding Glasswing partner, and Mythos Preview is available through Amazon Bedrock. This is significant — it means AWS customers in approved organizations can integrate these capabilities into their security workflows through familiar APIs.
For AWS builders, the practical implications:
- Use Bedrock to run security analysis on your own codebases now, even with current models like Opus 4.6. Don’t wait for Mythos general availability.
- Automate vulnerability triage — Mythos achieved 89% exact severity agreement with human experts across 198 reviewed reports. That’s better than most junior security engineers.
- Shorten your incident response pipeline. If exploit development takes hours instead of weeks, your response window shrinks proportionally.
The Bigger Picture
Anthropic is making a calculated bet: release the most dangerous cybersecurity capability ever built, but only to defenders, and build an industry coalition to use it responsibly. The SHA-3 cryptographic commitments for undisclosed vulnerabilities are a nice accountability touch — they can prove later what they found and when.
The $100M commitment isn’t charity. It’s Anthropic positioning itself as the responsible steward of capabilities that will exist regardless. Better to have AWS, Google, Microsoft, and Apple finding these bugs together than to have them discovered independently by actors with different incentives.
Global cybercrime costs roughly $500 billion per year. If Glasswing patches even a fraction of the vulnerabilities Mythos can find, the ROI is obvious. But the real value is the precedent: an industry-wide coordinated response to AI capabilities that outpace traditional defenses.
What Comes Next
Anthropic plans to ship Mythos-class capabilities in an upcoming Claude Opus model with new safeguards. A “Cyber Verification Program” will let security professionals bypass those safeguards for legitimate work. The 90-day Glasswing report will be the first real data on what happens when you point a frontier model at the world’s critical infrastructure and ask it to find every bug.
We’re entering an era where the best vulnerability researcher on the planet is an API call. The only question is whether you’re using it to find your bugs before someone else does.
Sources: Anthropic Red Team — Mythos Preview · Project Glasswing
Never miss a post
Get notified when I publish new articles about AI, Cloud, and AWS.
No spam, unsubscribe anytime.
Comments
Sign in to leave a comment
Related Posts
Your Security Team Wants to Privatize Your App — Here's What They Actually Need
When your security team says 'make it private', they usually mean 'make it secure.' This post compares four approaches — VPC privatization, WAF IP allowlisting, CloudFront + auth hardening, and AWS Verified Access — and explains why Zero Trust beats network perimeters for internal applications.
Cloud Sovereignty Deep Dive - AWS KMS Control Plane Analysis
XKS protects key material from extraction, but does it protect against legal compulsion to use those keys? Updated with AWS European Sovereign Cloud (GA January 2026).
Cloud Sovereignty for the Board — A 3-Tier Architecture That Maps Data Sensitivity to Control Level
Your board asks 'is our data safe in the cloud?' The answer is not yes or no — it is a classification decision that maps each workload to the right control tier. Here is the framework, with the metadata exposure gap most teams miss.
