
Claude Mythos Preview: Why Anthropic Locked Its Best Security Model Behind a Wall

ai.rs Apr 8, 2026

On April 7, Anthropic announced Claude Mythos Preview alongside Project Glasswing — a frontier AI model purpose-built to find and exploit software vulnerabilities, paired with a partner program that decides who gets to use it.

Mythos is not on the API price list. It is not on a waitlist page. It is not coming to Claude.ai next week. If you are reading this and you do not work for AWS, Apple, Cisco, Google, Microsoft, or one of about 50 other vetted organizations, you cannot have it. That is not an oversight. That is the entire point of how Anthropic shipped this model.

Here is what Mythos actually does, who is in Glasswing, and why the access wall exists.


What Mythos Found

Anthropic led the announcement with two findings that are difficult to dismiss as benchmark theater.

A 27-year-old vulnerability in OpenBSD that allowed remote crashes. OpenBSD is the operating system whose entire brand identity is built on aggressive code review and proactive auditing. A bug that survived 27 years in that codebase is, in practice, a bug human reviewers were never going to find on their own.

A 16-year-old flaw in FFmpeg whose surrounding code path automated coverage-guided fuzzers had executed more than 5 million times without ever triggering the bug. This is the more technically interesting finding. Modern fuzzing is supposed to be the gold standard for catching memory corruption in C codebases. Five million hits with no crash means the bug is reachable but triggers only under specific semantic conditions: exactly the kind of "needs to actually understand the code" gap that LLMs are theoretically well suited to close.
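That gap can be sketched in a few lines. The toy parser below is an invented example (it has nothing to do with FFmpeg's actual code): its buggy branch sits on a path any coverage-guided fuzzer will mark as explored, but the crash requires two header fields to agree with each other, a semantic relation random mutation almost never produces.

```python
import random

def parse_record(data: bytes):
    """Toy parser: a 4-byte header (2-byte tag, 2-byte length) plus payload."""
    if len(data) < 4:
        return None
    tag = data[0:2]
    declared_len = int.from_bytes(data[2:4], "big")
    payload = data[4:]
    # Every input with a 4-byte header reaches this point, so a
    # coverage-guided fuzzer records the region as thoroughly explored.
    if tag == b"\xca\xfe" and declared_len == len(payload):
        # The bug: trusts declared_len and reads one byte past the end.
        return payload[declared_len]  # off-by-one read -> IndexError
    return None

random.seed(1)
path_hits = crashes = 0
for _ in range(200_000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(16)))
    if len(blob) >= 4:
        path_hits += 1  # the "surrounding code path" was executed
    try:
        parse_record(blob)
    except IndexError:
        crashes += 1

print(f"surrounding path executed {path_hits:,} times, bug triggered {crashes} times")
```

Random inputs execute the surrounding path roughly 150,000 times here without a single crash, because hitting the bug needs both a magic tag and a length field that matches the payload. A model that reads the code sees the triggering condition directly instead of hoping to stumble into it.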

Anthropic also reported multiple Linux kernel privilege-escalation vulnerabilities and claims "thousands of high-severity vulnerabilities" in total across operating systems, browsers, and foundational libraries.

The One Number That Matters

  Benchmark                               Mythos Preview   Opus 4.6
  CyberGym (vulnerability reproduction)        83.1%          66.6%

CyberGym measures whether a model can take a vulnerability description and actually reproduce a working exploit against the real target codebase. It is not multiple choice. It is not pattern matching against CVE databases. It is "build the thing that triggers the bug."

Going from 66.6% to 83.1% on a benchmark like that is not an incremental improvement. It is the difference between a useful research assistant and an autonomous agent you can leave running against a codebase overnight and trust to come back with reproductions instead of false positives.

Anthropic explicitly says Mythos "performs autonomously without human steering in many cases." That phrasing matters. Most AI security tooling today still requires a researcher in the loop to triage and verify. Mythos, in the cases where it works, does not.

Who Is In Glasswing

Project Glasswing launched with 12 founding partners:

  • Cloud and infrastructure: AWS, Google, Microsoft
  • Hardware and operating systems: Apple, Cisco
  • Plus seven others spanning major technology vendors and security organizations

Beyond the founding 12, Anthropic added 40+ more organizations focused on critical infrastructure protection and open-source maintenance. The selection criteria, as described in the announcement:

  1. You maintain code that other people depend on at scale (operating systems, browsers, kernels, foundational libraries)
  2. You operate critical infrastructure (cloud platforms, networking, finance)
  3. You are an open-source security organization with a track record

Notably absent from the public list: penetration testing firms, bug bounty platforms, and anyone whose business model is selling vulnerability research to third parties. That is a deliberate choice, and we will get to why.

Why It Is Gated

There are three reasons Mythos is not generally available, and they reinforce each other.

1. The dual-use problem is unavoidable

A model that can autonomously find a 27-year-old bug in OpenBSD is also a model that can autonomously find unknown bugs in your production stack. The capability does not care about the operator's intent.

Anthropic could have published Mythos behind a standard "acceptable use policy" click-through, the way every other AI lab handles dual-use risk. They chose not to. The math is brutal: if even a small fraction of paying API customers used Mythos to find zero-days for sale, the result would be a measurable spike in real-world exploitation against the same critical infrastructure Anthropic is trying to protect.

Gating by partnership is an admission that policy alone is insufficient when the capability gap is this large.

2. Pricing as soft access control

When Mythos eventually does reach general availability, it will cost $25 per million input tokens and $125 per million output tokens. For comparison, Claude Opus 4.6 sits at roughly $15 input and $75 output per million tokens, so Mythos runs about 1.7x Opus pricing on both input and output.

That premium is doing two things at once.

First, it reflects real cost: Mythos is almost certainly larger than Opus, almost certainly does more internal reasoning per token, and almost certainly was more expensive to train. You do not get autonomous CyberGym performance for free.

Second, and more importantly, the price is a soft access control mechanism. At $125 per million output tokens, you do not casually point Mythos at every public GitHub repository to see what it finds. The economics make opportunistic mass-scanning prohibitively expensive while keeping targeted defensive use affordable for organizations that have a specific codebase to harden.
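As a rough sketch of that filter effect, using the quoted prices and invented token counts (the per-audit consumption figures below are illustrative assumptions, not Anthropic numbers):

```python
# Quoted Mythos prices: $25 / $125 per million input / output tokens.
INPUT_PER_MTOK = 25.0
OUTPUT_PER_MTOK = 125.0

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one autonomous run at the quoted rates."""
    return (input_tokens / 1e6) * INPUT_PER_MTOK + (output_tokens / 1e6) * OUTPUT_PER_MTOK

# Assumed consumption for one overnight audit of a mid-sized codebase:
# 20M input tokens of source context, 5M output tokens of reasoning
# and reports. Purely illustrative.
per_audit = run_cost(20_000_000, 5_000_000)
print(f"one targeted audit:    ${per_audit:,.0f}")
print(f"100,000 repos scanned: ${per_audit * 100_000:,.0f}")
```

Under these assumptions a single targeted audit costs about $1,125, well within reach for a team hardening its own codebase, while indiscriminately pointing the model at 100,000 repositories runs past $100 million. That is the asymmetry the pricing is built to create.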

This is the same logic that keeps satellite imagery affordable for journalists but expensive for stalkers. Pricing is not just revenue. It is a filter.

3. The subsidy structure tilts the balance toward defenders

Anthropic committed $100 million in usage credits to Glasswing partners and donated $4 million to open-source security organizations. Read those numbers in context: defenders are getting subsidized to use Mythos at zero or near-zero marginal cost, while everyone else faces full price plus access restrictions.

That is a deliberate asymmetry. Anthropic is paying to put Mythos in the hands of the people who maintain the code, before it is available to anyone who might want to exploit it. The window between "defenders can use this" and "attackers can buy this" is the entire game, and Anthropic is spending $100 million to widen it.

Whether that strategy actually works depends on how long the window stays open. If a competing lab ships an equivalent capability without the access controls, the asymmetry collapses overnight. If Anthropic stays meaningfully ahead on this specific capability for six months, defenders get a meaningful head start on hardening the most-used software on the planet.

When You Will Get Access

The official answer is "after we develop appropriate safeguards with an upcoming Claude Opus model." The unofficial reading: months, not weeks, and tied to a future release rather than a fixed date.

Realistically, Mythos in its current form is unlikely to be sold directly to the open API market. What seems more probable is that the techniques pioneered for Mythos — the training data, the autonomous-loop scaffolding, the safety filters — will be folded into a future general-purpose Opus release in a more constrained form. You will get some of the capability, with guardrails that prevent the most concerning use cases.

If you want the unconstrained version, your path is Glasswing membership. The application process is not public, but the criteria are: maintain critical software, demonstrate operational security, commit to responsible disclosure.

What To Actually Do

If you maintain critical infrastructure or foundational open-source software: investigate Glasswing. The 40+ non-founding partners suggest the program is actively expanding, and the subsidized usage credits are the cheapest security audit you will ever get.

If you build products on the Claude API: nothing changes today. Opus 4.6 and Sonnet 4.6 remain your daily drivers. But the existence of Mythos is a clear signal that the gap between "the best model Anthropic has trained" and "the best model Anthropic will sell you" is widening — and for the first time, Anthropic is being transparent about that gap rather than pretending it does not exist.

If you run a security team at a normal company: wait. The Mythos-derived safeguards in the next Opus release will likely cover the use cases you care about (code review, vulnerability triage, secure-coding assistance) without the access friction. Spending engineering time on Glasswing applications when you do not maintain a kernel is probably not the best use of the quarter.

The Bigger Signal

Set aside the specific capability for a moment. The more important thing about Mythos is that Anthropic chose to ship a frontier model with deliberate access controls, full stop. Every previous Claude release has been framed as "as broadly available as we can make it." Mythos is the first time Anthropic has publicly drawn a line and said: this one is too dangerous to sell to everyone, and we are going to gate it on who you are rather than what you promise.

That precedent matters more than the OpenBSD bug. If Mythos works the way Anthropic claims, expect more specialized frontier models with similar access structures — for biotech, for finance, for any domain where the dual-use math gets uncomfortable. The era of "one model, one API, one price list" is not over, but it is no longer the only shape an AI lab can take.

For now, Mythos exists, it is genuinely impressive, and you cannot have it. That is the story.

