Anthropic AI models have been at the centre of some of the most breathless AI coverage of the past year — and now they’re at the centre of something far more consequential. Late last week, Anthropic confirmed it had been forced to take its two most powerful frontier models, Claude Fable 5 and Mythos 5, completely offline after receiving what it described as a U.S. government export control directive. The reason? A reported jailbreak that Anthropic insists was far less dramatic than Washington apparently feared.
- Anthropic AI models Claude Fable 5 and Mythos 5 were abruptly disabled after a U.S. government export control directive citing national security.
- The government’s concern centred on a reported jailbreak of Anthropic AI models — one Anthropic says demonstrated only minor, already-known vulnerabilities.
- Anthropic had spent weeks marketing Mythos as dangerously powerful, which may have primed regulators to overreact when a bypass emerged.
- The incident shows that AI safety messaging, however well-intentioned, carries real regulatory consequences when governments take it literally.
Table of Contents
What the Export Control Directive Actually Says
The order, as Anthropic describes it, is an export control directive prohibiting foreign nationals from using the Anthropic AI models — not just inside the United States, but anywhere in the world. The national security concern underpinning it was left unspecified, at least in Anthropic’s public statement. ‘We believe this is a misunderstanding and are working to restore access as soon as possible,’ the company wrote on its website.
The practical problem Anthropic faced was immediate: complying with a restriction keyed to nationality is operationally messy at the best of times. When you employ non-U.S. nationals at your own offices, it becomes nearly unworkable. So rather than attempt a half-measure, Anthropic pulled both models entirely and is apparently in active negotiations with the government to get them back online. It’s a blunt solution to a complicated problem, but probably the right one given the stakes of getting it wrong.

The Jailbreak That Spooked the Government
Here’s where things get interesting. According to Anthropic, the trigger for the directive was the government learning about a method for jailbreaking Fable 5 — that is, coaxing the model into ignoring its safety guardrails. The company reviewed the specific technique that was demonstrated to officials and came away unimpressed.
‘We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.’
In other words, Anthropic’s position is that the jailbreak in question wasn’t exposing anything novel or uniquely dangerous. The vulnerabilities were already known, relatively trivial, and reproducible without even needing to bypass Fable 5’s safeguards. If that assessment is accurate — and there’s no independent verification yet — then the government appears to have overreacted to a demonstration that overstated the risk.
Anthropic also points out that it was transparent about jailbreak limitations from day one. When Fable 5 launched, the company’s own blog post acknowledged that making a model completely jailbreak-proof is ‘likely impossible.’ The goal, Anthropic wrote, is to make any remaining jailbreaks ‘sufficiently slow and costly that we can detect and prevent them before they are used at scale.’ That’s an honest framing of where frontier AI safety actually stands right now — but it’s not a particularly reassuring sentence for a government official who’s just watched a demonstration of a bypass. Anthropic AI models operate under this same principle, meaning some residual jailbreak risk is an accepted reality across the entire product line.
Anthropic AI Models and the Hype That May Have Backfired
It’s hard to look at this situation and not notice the irony at its core. For months, Anthropic had been loudly — and arguably deliberately — positioning its Mythos-class Anthropic AI models as almost frighteningly capable. When it first unveiled the Mythos Preview model back in April, the company chose not to release it publicly. Instead, it published a detailed system card cataloguing the model’s more alarming potential: deceptive behaviour, the ability to break containment from restricted environments, and — most viscerally — what it described as being ‘capable of significant cross-domain synthesis relevant to catastrophic biological weapons development.’
That language was always going to travel far beyond the AI research community, and it did. The New York Post ran a piece quoting computer scientist Roman Yampolskiy warning that AI might soon generate ‘hacking tools, biological weapons, chemical weapons, novel weapons we can’t even envision.’ That phrase — ‘weapons we can’t even envision’ — made it into the headline. British government officials and finance sector leaders reportedly scrambled to form contingency plans. According to the New York Times, the Trump administration’s previously hands-off stance on AI regulation shifted after the Mythos announcement, directly contributing to the development of a safety-focused AI executive order that the president signed roughly a week ago.

Then, after all that, Anthropic launched Fable 5 and Mythos 5 anyway. Fable 5 was billed as ‘a Mythos-class model that we’ve made safe for general use,’ but with capabilities that ‘exceed those of any model we’ve ever made generally available.’ Mythos 5 went out through the restricted Project Glasswing programme to a limited set of vetted partners. Brian Merchant at Blood in the Machine put it bluntly: Anthropic had spent months telling the world it had built something too dangerous to release, then released it. The Anthropic AI models at the centre of that criticism — Fable 5 and Mythos 5 — are now offline as a direct consequence.
Hours after Merchant published those words, the export control directive arrived.
A Safety Strategy That Created Its Own Risk
What we’re watching is a feedback loop that the AI industry hasn’t fully reckoned with yet. When companies like Anthropic describe their Anthropic AI models in the most alarming possible terms — partly to manage expectations, partly to signal responsibility, partly because some of the capabilities genuinely are concerning — they’re also training regulators and policymakers to treat those systems as existential threats. When a jailbreak then surfaces, however minor, it lands in a context where officials have already been told these models could reshape global security.
The safeguards themselves had already generated controversy before the government stepped in. One guardrail built into Fable 5 — designed to quietly penalise users who tried to abuse the system — was widely criticised as ill-conceived, and Anthropic ended up apologising for it. That stumble, combined with the jailbreak report, may have been enough to push the government from cautious observer to active intervener.

It’s also worth flagging that Anthropic wasn’t operating in a vacuum. The company says it worked with both the U.S. and U.K. governments, as well as multiple private organisations, to develop the safety framework around these models before launch. That collaboration clearly wasn’t enough to prevent the shutdown — which raises a genuine question about what ‘working with government’ actually accomplishes if a single jailbreak demonstration can undo months of joint preparation.
One additional detail that adds texture to all of this: Anthropic retains user data for its Mythos-class Anthropic AI models more extensively than it does for standard models. That’s a deliberate choice, presumably to monitor for misuse. It’s also the kind of policy detail that tends to get lost in the noise of capability announcements and safety theatre, but it matters for anyone thinking about who actually has eyes on how these models are being used.
What Comes Next for Frontier AI and Government Oversight
Anthropic says it’s working to restore access to the Anthropic AI models as quickly as possible, and the company’s framing suggests it expects this to be resolved relatively soon. Whether that optimism is warranted depends entirely on how receptive the government is to the argument that the jailbreak in question was trivial. That’s not an easy sell to officials who’ve spent the better part of a year being told these systems are uniquely dangerous.
The broader pattern here is one the industry needs to take seriously. The U.S. Bureau of Industry and Security has steadily expanded its remit into AI and advanced computing over the past few years, and this incident suggests that frontier model releases are now firmly within its line of sight. If every major model launch becomes a potential trigger for export control action — especially in the aftermath of a public jailbreak disclosure — the commercial calculus around how companies talk about their models’ capabilities is going to shift. Anthropic’s approach of leading with danger and following with deployment may have made perfect sense as a communications strategy. As a regulatory strategy, the results speak for themselves. For now, Anthropic AI models remain unavailable to most of the world, and the industry is left watching to see how the company navigates its way back.
Source: Gizmodo
Frequently Asked Questions
Why were Anthropic AI models disabled by the U.S. government?
The U.S. government issued an export control directive barring foreign nationals from accessing the models, apparently triggered by its belief that a jailbreak bypassing Fable 5’s safety guardrails had been discovered. Anthropic says it reviewed the technique and found it exposed only minor, previously known vulnerabilities also discoverable in other public models.
What is Project Glasswing and how does it relate to this shutdown?
Project Glasswing is Anthropic’s restricted access programme that lets vetted partners test its most powerful Mythos-class models. Mythos 5 was released exclusively through Glasswing before the export control order forced both it and Fable 5 offline shortly after launch.
Is a perfectly jailbreak-proof AI model possible?
Anthropic says it’s ‘likely impossible to completely prevent universal jailbreaks.’ Its stated goal is to make any successful bypass slow, costly, and narrow enough to detect before it can be exploited at scale.
What happens to non-U.S. users of Anthropic’s frontier models?
The export control directive specifically targets non-U.S. nationals, preventing them from accessing the models inside or outside the country. Anthropic pulled access globally rather than risk non-compliance, partly because non-U.S. nationals are employed at the company itself.

