Things Are About to Get ****** Messy

Anthropic is making a very specific argument: AI models are moving from being useful assistants for security work to becoming tools that can meaningfully speed up both vulnerability discovery and exploit development.

Their technical post on Claude Mythos Preview says the model can find real software flaws, including previously unknown ones, and in some cases turn them into working exploits. Their Project Glasswing announcement explains the response: Anthropic is not releasing Mythos broadly, and is instead giving controlled access to a group of major partners to help secure critical software first.

To understand why this matters, it helps to separate two ideas. Finding a bug is one thing. Turning that bug into something that reliably breaks into a machine, escapes a sandbox, or gets root access is another.

Anthropic’s main claim is that Mythos Preview is much better at the second step than their earlier models. In the technical post, they describe the model as able to identify and exploit zero-days across major operating systems and browsers, while Glasswing frames this as a potential shift in the balance between attackers and defenders.

What Anthropic Is Actually Saying

The term zero-day matters here. A zero-day is a vulnerability that was unknown to the software’s maintainers, and therefore unpatched, when it was found. Anthropic argues that if a model discovers one, that is strong evidence it is not just replaying training data.

In the post, they say their evaluations increasingly focus on novel real-world bugs rather than old benchmark cases, because that is a better way to test whether the capability is real.

One example they highlight is an old OpenBSD bug in TCP SACK handling. Anthropic says Mythos found a bug rooted in OpenBSD’s implementation of Selective Acknowledgment that could let an attacker crash a machine remotely. OpenBSD later published errata describing the issue as “TCP packets with invalid SACK options could crash the kernel.” That does not independently prove every part of Anthropic’s narrative, but it does show there was a real OpenBSD issue of the kind they described.

Another example is stronger from a public verification point of view: Anthropic says Mythos autonomously found and exploited a FreeBSD NFS-related vulnerability, and the post identifies it as CVE-2026-4747. Anthropic presents this as a case where the model moved from scanning code to building a working exploit chain with little or no human help after the initial prompt.

What Makes This Different

The technical post also claims Mythos found vulnerabilities in FFmpeg, major browsers, Linux kernel code, cryptography libraries, web applications, and even a memory-safe virtual machine monitor.

But this is where caution matters.

Anthropic says that over 99% of the vulnerabilities they found had not yet been patched at publication time, so they withheld most technical details. Instead, they published SHA-3 hashes of reports or proof-of-concept documents as cryptographic commitments, promising to reveal the underlying material later after responsible disclosure windows close.

That means many of the biggest claims are not yet independently verifiable in full.

So the right way to read this is not as blind certainty, and not as automatic dismissal either. Some parts are already grounded in public fixes, advisories, and concrete examples. Other parts are still claims that will need time and later disclosure to verify properly.
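The hash commitments mentioned above can be checked mechanically once the underlying documents are released. The sketch below shows the general idea only. It assumes the SHA3-256 variant (the post says SHA-3 without the variant being stated here), and the report bytes and function names are hypothetical placeholders, not Anthropic's actual material or tooling.

```python
import hashlib
import hmac


def sha3_commitment(document: bytes) -> str:
    """Hex digest that can be published before the document itself.

    Assumption: SHA3-256. The real commitments may use another SHA-3 variant.
    """
    return hashlib.sha3_256(document).hexdigest()


def verify_commitment(document: bytes, published_hex: str) -> bool:
    """Check that a later-revealed document matches the earlier published hash."""
    computed = sha3_commitment(document)
    # Constant-time comparison; normalise whitespace and case of the published value.
    return hmac.compare_digest(computed, published_hex.strip().lower())


# Hypothetical report standing in for a withheld proof-of-concept document.
report = b"Hypothetical vulnerability report: details withheld until patched.\n"
published = sha3_commitment(report)

print(verify_commitment(report, published))         # unchanged document
print(verify_commitment(report + b" ", published))  # document altered by one byte
```

The point of the design is that the hash reveals nothing useful about the report, yet any later change to the report, even a single byte, produces a different digest. A commitment like this proves the document existed in that form at publication time. It does not prove that the claims inside it are correct.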

Why Project Glasswing Matters

This is where Project Glasswing comes in.

Glasswing is Anthropic’s attempt to act as if the technical claims are serious enough to require defensive coordination right now. Anthropic says the project launched with partners including AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, along with more than 40 additional organizations.

Anthropic also says it will provide up to $100 million in usage credits and $4 million in direct donations to open-source security organizations.

The message is simple: even if the long-term effect of AI is to make software more secure, the transition period could be ugly.

If model-assisted exploit development gets much faster, then the time between a vulnerability becoming known and attackers having a usable exploit may collapse.

That is especially dangerous for N-days, meaning already-disclosed vulnerabilities that many systems still have not patched.

The Bigger Problem Was Already There

There is another layer to this that people do not like to talk about openly.

A lot of widely used software and services already sit under legal pressure from governments to enable access, scanning, interception, or technical compliance in various forms. But it would be inaccurate to say, flatly, that “most popular software has backdoors because the law requires them.”

There is no single broad law that simply mandates secret backdoors across all mainstream software.

What is true is that governments have repeatedly pushed for lawful access, message scanning, or technical capability requirements, and privacy groups have warned for years that these kinds of mandates can function like backdoor pressure even when they are described in softer language.

That matters because AI does not need to invent every weakness from scratch. It may simply make existing fragility much easier to discover, test, combine, and exploit.

In other words, AI may amplify a landscape that was already full of weak control points, hidden assumptions, and legal pressure toward more access.

We Are Only Seeing One Small Part of This

And we should be honest about something else.

In this story, we are talking about one company and one model family being described through a controlled public narrative.

What we can actually see is the consumer-facing layer: the blog posts, the partner program, the benchmark scores, and a small number of examples that have become public enough to discuss.

That is useful, but it is not the whole picture.

The strongest systems are not always the ones the public gets to use, and the systems that matter most may exist behind company walls, state labs, military programs, or private offensive security environments that are far less visible.

That is an inference, not a documented fact, but it is a reasonable one.

So while it would be wrong to claim as fact that fully unrestricted offensive AI is already everywhere, it is also hard to believe that every advanced effort is neatly confined to consumer products with polished safety language around them.

The more sober position is this:

We should assume that stronger, less visible, and less restricted systems are being actively pursued somewhere, and that they are only going to improve.

That is not proof of a secret program. It is just the direction of travel.

Why This Feels Different Now

Glasswing also tries to support the case with benchmark numbers.

Anthropic says Mythos Preview outperformed Opus 4.6 on the CyberGym vulnerability reproduction evaluation, and also posted stronger results on coding and agentic benchmarks such as SWE-bench Pro, Terminal-Bench 2.0, and SWE-bench Verified.

Anthropic’s argument is that cyber capability is not some narrow special trick. It is emerging from general gains in coding, reasoning, tool use, and autonomy.

That is part of why this story feels different.

This is not being presented as a model trained only to hack.

It is being presented as a model that got better across the board, and whose cyber capability emerged as a consequence of becoming more capable in general.

What Security Teams Should Take From This

These posts point to a few priorities that matter more than hype, headlines, or model branding.

Shorten patch cycles

If exploit development is accelerating, then the old buffer between disclosure and weaponization is getting smaller. Security fixes, dependency upgrades, and emergency patching need to move faster.
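One way to make "shorter patch cycles" concrete is to measure the exposure window, meaning the days between public disclosure and the fix actually being deployed, and flag anything over a budget. The sketch below is illustrative only: the record layout, the `EXAMPLE-` identifiers, the dates, and the 7-day budget are all made-up assumptions, not figures from either Anthropic post.

```python
from datetime import date
from typing import Optional


def exposure_days(disclosed: date, patched: Optional[date], today: date) -> int:
    """Days a system stayed exposed after public disclosure.

    An unpatched vulnerability is counted as exposed up to `today`.
    """
    end = patched if patched is not None else today
    return (end - disclosed).days


def over_budget(records, budget_days: int, today: date):
    """Return the IDs whose exposure window exceeds the patch budget."""
    return [
        vuln_id
        for vuln_id, disclosed, patched in records
        if exposure_days(disclosed, patched, today) > budget_days
    ]


# Hypothetical inventory: (id, disclosure date, deployment date or None).
records = [
    ("EXAMPLE-0001", date(2025, 1, 6), date(2025, 1, 9)),
    ("EXAMPLE-0002", date(2025, 1, 6), date(2025, 2, 20)),
    ("EXAMPLE-0003", date(2025, 3, 1), None),
]
today = date(2025, 3, 10)

print(over_budget(records, budget_days=7, today=today))
# → ['EXAMPLE-0002', 'EXAMPLE-0003']
```

If exploit development really does compress from weeks to days, the budget in a check like this is the number that has to shrink.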

Use frontier models for defense now

Anthropic’s own recommendation is not to wait for a perfect future system. Use available frontier models today for bug triage, patch drafting, review support, investigation assistance, and defensive automation.

Focus on identity and control planes

Credentials, service account tokens, cloud permissions, routers, DNS, and workload control paths are all part of the modern attack surface. These are not side issues anymore.

Treat trusted tools as part of the threat model

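If AI can help attackers move faster, then ordinary administrative utilities, development tools, and management software become even more important to monitor and constrain, because an attacker who abuses software that defenders already trust has less need to bring anything obviously malicious.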

Prepare for a messy transition

The real danger is not just some final superintelligent future. It is the awkward in-between period where defenders are still adapting, critical systems still run legacy code, governments still push for more access, and attackers only need a few weak points to cause real damage.

The Real Takeaway

The cleanest summary is this:

Anthropic is saying that AI is no longer just helping people find bugs. It is starting to help turn bugs into attacks at a speed that could seriously stress the current security model.

At the same time, the legal and technical environment was already messy before AI arrived.

AI may now amplify all of it: the defensive upside and the offensive worst case alike.

Glasswing is Anthropic’s answer: keep the strongest system controlled, give defenders early access, and try to harden important software before this capability spreads further.

Whether that will be enough is the open question.
