Anthropic Seeks AI-Blocking Powers, Gets Blocked Instead
On June 10, 2026, Anthropic published a detailed policy proposal in two parts, the Advanced AI Framework and the Economic Policy Framework, formally asking the US federal government for “the legal authority to block or deter the deployment of models that pose a significant risk of catastrophic harm”, with graduated civil penalties for violations (Anthropic, June 10, 2026). Two days later, a US government export control directive led Anthropic to suspend access to Claude Fable 5 and Claude Mythos 5, its flagship models, launched just three days earlier, for all customers, not just foreign ones. Anthropic complied, but in the same statement cited its own proposal to challenge the process by which the government exercised that power (Anthropic, June 12, 2026).
It’s a 48-hour narrative arc with a single actor on both sides: for the first time, a major lab requested regulatory power to intervene on its own models, and that power was then exercised against those models without the process the lab itself had called for.
What happened
June 10, 2026 — the proposal. Anthropic publishes on its website a two-part policy framework, written primarily with the US federal government in mind, but with sections relevant to the rest of the world. The first part, the Advanced AI Framework, defines a precise threshold for frontier models — trained with more than 10²⁵ FLOP, developed by companies with over $500 million in AI-related revenue or over $1 billion in AI R&D spending — and identifies four categories of catastrophic risk: biological, cyber, loss of control, and automated R&D that accelerates the other three. The most politically significant request in the framework is clear: the government should be able to block or deter the deployment of models that pose a significant catastrophic risk, with civil penalties tied to annual global revenue, graduated for repeated violations.
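The applicability criteria described above can be read as a simple predicate. The sketch below is illustrative only: `ModelProfile` and `is_covered_frontier_model` are hypothetical names, and combining the compute threshold with either financial criterion (compute AND (revenue OR R&D spend)) is one reading of the summary above, not the framework's own text.

```python
from dataclasses import dataclass

# Thresholds as summarized in this article; the framework PDF is authoritative.
COMPUTE_THRESHOLD_FLOP = 1e25
AI_REVENUE_THRESHOLD_USD = 500_000_000
AI_RND_THRESHOLD_USD = 1_000_000_000


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical record describing a model and its developer."""
    training_compute_flop: float
    developer_ai_revenue_usd: float
    developer_ai_rnd_spend_usd: float


def is_covered_frontier_model(profile: ModelProfile) -> bool:
    """Return True if the model meets the thresholds under the assumed reading."""
    exceeds_compute = profile.training_compute_flop > COMPUTE_THRESHOLD_FLOP
    # Assumption: either financial criterion is sufficient on its own.
    large_developer = (
        profile.developer_ai_revenue_usd > AI_REVENUE_THRESHOLD_USD
        or profile.developer_ai_rnd_spend_usd > AI_RND_THRESHOLD_USD
    )
    return exceeds_compute and large_developer
```

Under this reading, a model trained with 2e25 FLOP by a developer with $600 million in AI revenue would be covered, while the same model from a developer below both financial thresholds would not.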
The same day, CEO Dario Amodei published a personal essay, “Policy on the AI Exponential”, framing the proposal within the US regulatory tradition (darioamodei.com, June 2026). The analogy is deliberately chosen, airplanes rather than nuclear weapons: “Frontier AI models, like airplanes, should be required to go through technical testing and auditing, and their release should be blocked or reversed as a threat to public safety if they do not meet high standards of safety.” Amodei explicitly cites the Trump Executive Order of June 2026 as incremental progress in the right direction, and recommends more action, not less.
The Advanced AI Framework isn’t a blank check for regulation. The original PDF explicitly includes “concrete safeguards that would prevent that power from being misused”, and specifies that blocking authority should be “scoped to the above four specific risks” with “protective measures against political favoritism or arbitrary decisions”. The proposal is a compact: yes to the power, but only through a statutory process that is transparent, fair, clear, and grounded in technical facts. These are exactly the words the June 12 statement reuses, and the qualities it says were missing.
June 12, 2026 — the directive. At 5:21 PM ET on Thursday, June 11, 2026, Anthropic received from the US government an export control directive ordering the immediate suspension of access to Fable 5 and Mythos 5 for any “foreign national”, wherever located, including Anthropic employees of foreign nationality. The net effect — announced by Anthropic in a statement on June 12 — was the disabling of both models for all customers, American and non-American, to ensure compliance. Access to all other Anthropic models — Sonnet, Opus, Haiku, other classes — was not affected.
The government letter, Anthropic writes, “did not provide specific details of its national security concern”. Anthropic’s understanding is that the government became aware of a method to bypass Fable 5’s safeguards — a jailbreak. Anthropic examined a demonstration of that technique and described the results as “a small number of previously known, minor vulnerabilities”, “relatively simple”, replicable “by other publicly-available models … without requiring a bypass”. The explicit reference is to OpenAI GPT-5.5, whose system card — published April 23, 2026, updated April 24 — classifies GPT-5.5 as “High capability in the Cybersecurity domain, but below Critical” and “High capability in the Biological and Chemical domain”. For Anthropic, that capability isn’t specific to Fable 5, and doesn’t justify a generalized recall.
Anthropic announced: “We are complying with the government’s legal directive and are removing access to Fable 5 and Mythos 5 for all users. However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people. If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”
And again, in closing: “As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
The phrases are exactly those the Advanced AI Framework had invoked as prerequisites. Anthropic cites two links in the preceding line: the June 10 policy framework and Amodei’s essay. The message is clear: the proposal exists, the government is using it as a basis for action, but is using it in a form that the proposal itself excluded.
Why it matters
1. For the sector: the first operational precedent of blocking power without the statutory process that was requested. Fable 5 had been launched on June 9, 2026, as the first publicly available “Mythos” class model (the class with more aggressive cyber safeguards), distributed via API. Three days later it’s unreachable via API by any customer. It’s not the first time a Western government has suspended a frontier model — it’s the first time the suspension power is exercised against the lab that, two days earlier, publicly requested that power. For OpenAI, Google DeepMind, Meta, xAI, Microsoft AI, the question becomes operational: if “narrow, non-universal jailbreak replicable by GPT-5.5” becomes the operational basis for suspending a commercial model, the risk surface of anyone releasing a frontier model increases. Anthropic itself writes: “If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”
2. For developers: the distinction between “all customers” and “foreign customers” is operational, not semantic. The export control directive prohibits access by foreign nationals, wherever they are located. To guarantee compliance, Anthropic also disabled access for customers who aren’t foreign nationals, so the practical effect is global. Anyone with evals, logs, agents, monitoring, or work in progress built on claude-fable-5 or claude-mythos-5 needs a fallback. For users of Sonnet, Opus, or Haiku, the API is unaffected, but the regulatory precedent concerns them too.
3. For the policy beat: the policy artifact of the week isn’t the statement, it’s the proposal. The Advanced AI Framework is the most detailed AI policy proposal ever published by a major lab: 27 pages of concrete framework, with measurable compute thresholds (10²⁵ FLOP), measurable applicability criteria ($500M revenue or $1B AI R&D), enforcement mechanisms (graduated civil penalties), and safeguards against misuse of the power itself. It’s the proposal that the US government, two days later, decides not to follow in the form Anthropic had requested. The real policy question isn’t “what happens to Fable 5” — it’s “what happens to the relationship between blocking power and statutory process in the next twelve months.”
4. For Anthropic watchers: a loop that opened three days earlier closes, but without resolution. The same model (Fable 5) was at the center of a different story on June 10-11, 2026, when Anthropic publicly reversed course on its invisible safeguards for the “distillation” category: a reversal on a transparency choice, not on the model. The June 12 suspension is a different story. It’s the US government’s choice, not Anthropic’s, and Anthropic’s resistance concerns the process, not the substance of the directive. The two stories intersect on the same model on consecutive days, and together they make one precise point: the boundary between “Anthropic decides what’s safe,” “customers decide what’s safe,” and “the government decides what’s safe” has visibly shifted.
What to watch
- The full text of the export control directive. Anthropic writes that the government letter “did not provide specific details of its national security concern” and that it will share “more details over the next 24 hours” from June 12. The full text of the directive — or an unclassified version — is the next critical event. Without it, you can’t assess whether the legal basis holds, or whether the cited jailbreak is truly equivalent to the cyber capabilities GPT-5.5 or other public models demonstrate.
- OpenAI’s position. Anthropic explicitly cites GPT-5.5. If OpenAI confirms that GPT-5.5 produces the same result, the suspension of Fable 5 becomes technically unsustainable as a safety measure, because it has not been extended to GPT-5.5. If OpenAI denies it, Anthropic’s claim weakens. OpenAI had not commented as of June 13, 2026.
- Reaction from Google DeepMind, Meta, xAI, Microsoft. Anthropic explicitly invited the industry to respond. A single or joint statement (Frontier Model Forum or equivalent) is a strong signal. Monitor for 7-14 days.
- The text of the Trump Executive Order of June 2026 that Amodei cites as “incremental progress” — and the relationship between that EO and the June 12 export control directive. If the EO provides the regulatory basis for the directive, it’s a precedent for executive power exercise in a sector that until now was governed by guidelines and voluntary Responsible Scaling Policies from individual labs.
- Availability of Fable 5 and Mythos 5. The statement says “working to restore access as soon as possible”. There’s no date. Don’t speculate on when they’ll return: monitor the Anthropic blog and the @AnthropicAI account for follow-up.
- The fate of the Advanced AI Framework as a legislative proposal. Amodei announces in the June 10 essay that Anthropic intends to provide “substantial financial backing” to a legislative process. The June 12 directive is the first exercise of a related power, but in a form the proposal excluded. The framework’s trajectory over the next 3-6 months is the leading indicator of how the US policy apparatus will evolve.
Risks and caveats
- Proposal, directive, position: three different things. The June 10 proposal requests a blocking power with a statutory process (transparent, fair, clear, grounded in technical facts). The June 11 directive (announced June 12) exercises a blocking power without that process. Anthropic’s position on the directive is that it complies but challenges the procedure. The article must keep these three things precisely separate.
- The “other models do the same jailbreak” claim is Anthropic’s, not a verdict. Anthropic writes that it examined “a report that we believe is the basis of the government’s directive” and that “the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5)”. That’s Anthropic’s review, not an independent audit. The OpenAI GPT-5.5 system card only confirms that GPT-5.5 is classified as “High capability in the Cybersecurity domain, but below Critical”, not that the specific capability cited by Anthropic is replicable point-by-point on GPT-5.5. OpenAI had not commented as of June 13, 2026.
- Fable 5 and Mythos 5 are not open source / open weights. They never were. They’re closed-weight models distributed via API and consumer product. A sentence in “What to do” for anyone thinking about self-hosting Mythos 5 is appropriate: it wasn’t a real option even before June 12.
- Fable 5 and Sonnet/Opus aren’t comparable classes. Fable 5 is Mythos-class, with more aggressive cyber safeguards (30-day retention requirement, extensive red-teaming with AISI UK and third parties, “defense in depth”). Sonnet, Opus, Haiku are different classes and aren’t subject to the same trade-offs. The article doesn’t compare capabilities across classes.
- The directive text isn’t public. Anthropic summarizes its content. Any assertion about “what the directive says” should be reported as “according to the June 12 Anthropic statement,” not as the directive’s own text. The legal basis invoked by the government — “national security authorities” — is standard language; Anthropic doesn’t challenge the legal basis itself, it challenges procedural fairness.
- It’s not an “Anthropic vs OpenAI” or “Anthropic vs US government” story. OpenAI is cited once (GPT-5.5 system card), and only because Anthropic cites it. As of June 13, 2026, no official US government spokesperson has commented on the directive beyond the letter to Anthropic. The article is about a proposal and a directive, not a rivalry.
- It’s not a story about the AIN-55 (Fable 5 invisible safeguards) reversal. The cross-link serves, in one sentence, to help readers who read the AIN-55 article place the two events. No more.
What to do
For teams using Fable 5 or Mythos 5 via API today. According to the June 9 launch post, the fallback Anthropic explicitly recommends is Claude Opus 4.8 (the previous class that Fable 5 used internally as a fallback for late queries). If Opus 4.8 isn’t available or isn’t suitable for the workload, Sonnet 4 or Haiku 4 are alternatives, with the expected cost/capability trade-offs. For production evals: re-run them on the fallback model, update monitoring dashboards, and document the cause of the rollback. For work-in-progress or production agents calling claude-fable-5 or claude-mythos-5: update the config and expect a period of instability until access is restored (no date has been given).
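The config update can be as small as a lookup that swaps a suspended model for the first available fallback and records why. This is a minimal sketch under stated assumptions: `resolve_model` is a hypothetical helper, and the fallback identifiers (`claude-opus-4-8`, `claude-sonnet-4`, `claude-haiku-4`) are placeholders, not confirmed API model IDs. Only `claude-fable-5` and `claude-mythos-5` come from this article.

```python
from __future__ import annotations

# Models suspended per Anthropic's June 12 statement (IDs as used in this article).
SUSPENDED_MODELS = frozenset({"claude-fable-5", "claude-mythos-5"})

# Hypothetical fallback order. These identifiers are placeholders:
# replace them with the model IDs your account actually exposes.
FALLBACK_CHAIN = ("claude-opus-4-8", "claude-sonnet-4", "claude-haiku-4")


def resolve_model(requested: str, available: set[str]) -> tuple[str, str | None]:
    """Return (model_to_call, rollback_reason).

    rollback_reason is None when the requested model can be used as-is.
    """
    if requested not in SUSPENDED_MODELS and requested in available:
        return requested, None
    # Walk the chain in priority order and take the first usable candidate.
    for candidate in FALLBACK_CHAIN:
        if candidate in available and candidate not in SUSPENDED_MODELS:
            return candidate, f"{requested} unavailable; fell back to {candidate}"
    raise RuntimeError(f"No fallback available for {requested}")
```

Logging the returned reason alongside each request gives a record of the rollback cause, and makes it straightforward to revert once access is restored.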
For teams using Sonnet, Opus, or Haiku. The API is unaffected, but the regulatory precedent is still relevant. It’s worth: (a) mapping each deployment to the model classes it depends on and the jurisdictions it operates in; (b) tracking whether the precedent is extended to other frontier models in ways that affect your pipelines; (c) if operating in regulated sectors (finance, healthcare, public sector, defense), informing legal and compliance teams that a new regulatory tool (an export control directive) has been used against a commercial production model within 72 hours of its launch.
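One way to approach the dependency mapping in point (a) is a small inventory that can be queried by jurisdiction. The sketch below is purely illustrative: the service names, jurisdictions, and the `claude-sonnet-4` identifier are invented for the example, and `exposure_by_jurisdiction` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical inventory: service names, jurisdictions, and the non-suspended
# model ID are invented for illustration.
DEPLOYMENTS = [
    {"service": "support-agent", "model": "claude-fable-5", "jurisdictions": ["US", "EU"]},
    {"service": "code-review", "model": "claude-mythos-5", "jurisdictions": ["US"]},
    {"service": "summarizer", "model": "claude-sonnet-4", "jurisdictions": ["EU"]},
]

# Models suspended per Anthropic's June 12 statement.
SUSPENDED_MODELS = {"claude-fable-5", "claude-mythos-5"}


def exposure_by_jurisdiction(deployments, suspended):
    """Group services that depend on a suspended model by jurisdiction."""
    exposed = defaultdict(list)
    for dep in deployments:
        if dep["model"] in suspended:
            for jurisdiction in dep["jurisdictions"]:
                exposed[jurisdiction].append(dep["service"])
    return dict(exposed)
```

Re-running the same query with a different set of models shows what would be exposed if a similar directive reached another model class.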
For AI policy practitioners. The Advanced AI Framework is the most detailed AI policy proposal published by a major lab so far. Three things to do: (1) read the original 27-page PDF, not just the blog summary; (2) also read Amodei’s essay, which contains the regulatory framing and the FAA analogy; (3) follow the framework’s trajectory on Capitol Hill and in Brussels over the next 3-6 months, since the proposal is explicitly presented as the basis for a legislative process.
For those following the AIN-55 (Fable 5 invisible safeguards) story. That was a story about an Anthropic product choice (invisible safeguards for distillation) and its correction. This is a story about a government directive and Anthropic’s position toward it. Same model, two different events, two different policy implications. Reading them as one story would confuse the picture.
Verdict
On June 10, 2026, Anthropic published a proposal asking the US government for blocking power over the riskiest frontier models, exercised through a statutory process that is transparent, fair, clear, and grounded in technical facts. On June 12, 48 hours later, that power was exercised on Fable 5 and Mythos 5 without following that process. Anthropic complies and, in the same statement, cites its own proposal to challenge the procedure. It’s a rare moment: a major lab publicly requests stricter rules for itself, and those rules become operational within 48 hours in a form the lab contests. The open question isn’t “when will Fable 5 and Mythos 5 return,” but whether the regulatory precedent holds, and whether other labs join Anthropic’s position or distance themselves from it.
“We believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
(Anthropic, statement of June 12, 2026)