
Anthropic Says Claude Mythos Is Too Dangerous to Release — And They Might Be Right
Anthropic's Claude Mythos found thousands of zero-day vulnerabilities and autonomously wrote browser exploits. Then they decided not to release it. An engineering manager's take on what this means.
In an industry where everyone races to ship, Anthropic just did the opposite. They built their most powerful model yet — Claude Mythos Preview — and then decided not to release it to the public. Not because it wasn't ready. Because it works too well.
Within weeks of internal testing, Mythos had autonomously discovered thousands of zero-day vulnerabilities across every major operating system and web browser. It found a 27-year-old flaw in OpenBSD that could crash systems. It uncovered a 16-year-old vulnerability in FFmpeg that automated tools had missed millions of times. It wrote a browser exploit that chained together four separate vulnerabilities, including a JIT heap spray that escaped both renderer and OS sandboxes.
No human told it how. It just... figured it out.
As an engineering manager who spends his days thinking about shipping code safely, this stopped me cold.
The Numbers Don't Lie
Let's ground this with data. Here's how Mythos stacks up against Claude Opus 4.6 (Anthropic's current flagship):
| Benchmark | Claude Opus 4.6 | Claude Mythos Preview | Relative Gain |
|---|---|---|---|
| CyberGym (vulnerability analysis) | 66.6% | 83.1% | +25% |
| SWE-bench Verified (software engineering) | 80.8% | 93.9% | +16% |
| SWE-bench Pro | 53.4% | 77.8% | +46% |
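One clarification on the last column: those figures are relative improvements over Opus 4.6, not raw percentage-point differences. A quick check reproduces them from the scores:

```python
def relative_gain(old: float, new: float) -> int:
    """Percent improvement of `new` over `old`, rounded to a whole percent."""
    return round((new - old) / old * 100)

# Reproduce the table's gain column from the raw benchmark scores.
print(relative_gain(66.6, 83.1))  # CyberGym          -> 25
print(relative_gain(80.8, 93.9))  # SWE-bench Verified -> 16
print(relative_gain(53.4, 77.8))  # SWE-bench Pro      -> 46
```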
The kicker? Anthropic says these capabilities emerged from "general improvements in code, reasoning, and autonomy" — not from explicit security training. Mythos wasn't built to be a hacking machine. It just became one because it got good enough at reasoning about code.
That's the part that should make you uncomfortable.
Brilliant Responsibility or Brilliant Marketing?
Let's be honest — when a company says "our product is too dangerous to release," there are two possible readings.
Reading one: genuine responsibility. Anthropic has always positioned itself as the safety-first AI lab. Their Constitutional AI work, their responsible scaling policies, their willingness to publish research that makes their own products look risky — there's a pattern here. When they say Mythos could be weaponized, they're being consistent with years of behavior.

Reading two: incredible marketing. Nothing sells like scarcity and danger. "Too dangerous to release" is the kind of headline that writes itself. Every tech journalist, every security researcher, every curious developer now needs to know about Mythos. You can't buy publicity like that.

Here's my take: both readings are true simultaneously, and that's fine.
The AI safety crowd sometimes acts like commercial incentives and genuine caution can't coexist. They can. Anthropic can genuinely believe Mythos is dangerous and understand that saying so publicly builds their brand. The question isn't about motives — it's about whether their actions match the concern.
And on that front, what they did is interesting.
Project Glasswing: Defenders Get First Dibs
Instead of a public release, Anthropic launched Project Glasswing — a controlled-access program that gives defensive security teams early access to Mythos.
The founding partners read like a who's who of tech infrastructure: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA, Palo Alto Networks, the Linux Foundation, and more — over 40 organizations in total. Anthropic committed up to $100 million in usage credits and $4 million in direct donations to open-source security organizations.
The logic is straightforward: let the defenders find and patch vulnerabilities before attackers get access to similar capabilities. It's responsible disclosure, but at AI scale.
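The mechanics mirror classic coordinated disclosure, just automated and at volume. As an illustrative sketch (this is not Anthropic's actual process), the core policy is an embargo window, like the 90-day default popularized by Google's Project Zero, after which a finding becomes publishable whether or not a patch has shipped:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Finding:
    """One AI-discovered vulnerability in a disclosure queue (illustrative)."""
    vuln_id: str
    reported: date
    embargo_days: int = 90  # Project Zero's well-known default window
    patched: bool = False

    def embargo_ends(self) -> date:
        return self.reported + timedelta(days=self.embargo_days)

    def disclosable(self, today: date) -> bool:
        # Publish once the vendor ships a fix, or once the embargo
        # lapses, whichever comes first.
        return self.patched or today >= self.embargo_ends()
```

The hard part Glasswing has to solve isn't this bookkeeping; it's running it for thousands of findings at once while keeping the unpatched queue secret.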
Does it work? Honestly, I'm cautiously optimistic, but skeptical.

The problem is time. Anthropic acknowledges that over 99% of the vulnerabilities Mythos found haven't been patched yet. That's a massive attack surface sitting in a database somewhere, protected only by access controls and NDAs. And we know how well those hold up — Anthropic itself accidentally leaked Mythos's existence through a data breach before the planned announcement, and then leaked Claude Code's source code days later.
If the company that built the model can't keep its own secrets, how confident should we be in the Glasswing perimeter?
What This Actually Means for Engineering Teams
Okay, enough industry commentary. If you're an engineering manager or tech lead, here's what I think matters:
**Your threat model just changed.** When a non-specialist can use an AI to find remote code execution vulnerabilities overnight — which is what Anthropic claims happened during testing — the economics of attacks shift dramatically. The barrier to entry for sophisticated exploits drops from "nation-state resources" to "API access."

**Security-by-obscurity is truly dead.** It was always a bad strategy, but now it's suicidal. If Mythos can find a 27-year-old bug in OpenBSD — one of the most security-audited codebases in existence — your internal APIs and hastily-written microservices don't stand a chance.

**AI-assisted code review becomes table stakes.** This is the silver lining. If AI can find vulnerabilities this effectively, it can also be pointed at your own codebase before attackers do. I expect tools like this to become standard in CI/CD pipelines within the next year or two.

**The "dual-use" conversation is no longer theoretical.** Every capability that makes AI useful for defenders makes it equally useful for attackers. The same model that helps your security team find bugs can help someone else exploit them. We've been having this conversation abstractly for years. Mythos makes it concrete.

The Uncomfortable Bottom Line
Here's what I keep coming back to: Anthropic didn't train Mythos to be a security tool. The cybersecurity capabilities emerged from making it better at general reasoning and coding. That means the next model — from Anthropic, or Google, or OpenAI, or anyone else — might develop similar or greater capabilities as a side effect of normal improvements.
You can't put this genie back in the bottle by deciding not to train for security. It comes for free once the reasoning is good enough.
Anthropic's controlled release through Project Glasswing is, I think, the right call for right now. Defenders need a head start. But it's a temporary measure. Frontier AI capabilities will advance substantially in the coming months — Anthropic's own words — and the window between "defenders have it" and "everyone has it" will keep shrinking.
The era where we debate whether AI can be dangerous ended the moment Mythos autonomously chained four vulnerabilities into a working browser exploit. The question now is whether we're building our defenses faster than we're building the weapons.
From where I sit, managing a team that ships production code every sprint — I'm not sure we are. And that's what keeps me up at night.
What do you think — is controlled release the right approach, or just a speed bump? I'd love to hear from other engineering leads navigating this. Drop me a line or find me on LinkedIn.