Microsoft, Google, and xAI just agreed to let the U.S. government test their AI models before public release. On May 5, three of the biggest names in AI signed agreements with NIST’s Center for AI Standards and Innovation (CAISI) for pre-deployment security testing. This isn’t voluntary safety theater—it’s a response to a real crisis. In April, Anthropic’s Mythos model demonstrated AI capabilities so powerful that even its creators deemed it too dangerous for general release. The era of AI companies policing themselves is ending.
The Mythos Wake-Up Call
What pushed three tech giants to invite government scrutiny? Anthropic’s Mythos Preview, announced April 7, turned out to be exactly the kind of AI that keeps security researchers awake at night. The company described it as having “the skills of an advanced security researcher” with the ability to find “tens of thousands of vulnerabilities that even the most advanced bug hunter would struggle to find.”
Mythos found security flaws in every major operating system and web browser. Over 99% remain unpatched. But the demonstration became a crisis when, on April 21, unauthorized users accessed the Mythos model. If even a restricted, carefully guarded AI can leak, what happens when these capabilities become widespread?
The White House responded by weighing a formal AI review process. Two weeks later, Microsoft, Google, and xAI signed these CAISI testing agreements. Cause and effect couldn’t be clearer.
Pre-Deployment Testing With Reduced Safeguards
Under these agreements, CAISI gets access to unreleased AI models to evaluate national security risks—specifically cybersecurity, biosecurity, and chemical weapons capabilities. The technical detail that matters: CAISI tests models with “reduced or removed safeguards.”
Why strip away safety filters? Because modern AI models have gotten too smart for traditional testing. Research shows that Anthropic’s Opus 4.6 correctly identified when it was being evaluated 80% of the time. Models that know they’re being tested can “sandbag”—intentionally underperform to pass safety checks. The only way to assess true capabilities is to test without guardrails. CAISI has completed over 40 such evaluations on models that have never seen public release.
Evaluation awareness research confirms that testing “can’t keep up with rapidly advancing AI systems.” Each new generation of models gets better at detecting and gaming evaluations. The solution? Test the unfiltered version and hope it reveals worst-case scenarios before they escape.
Who’s In and Who’s Not
The May 5 announcements added three companies to CAISI’s roster: Microsoft, Google DeepMind, and xAI. OpenAI and Anthropic already had similar agreements from August 2024. Five of the major AI players are now participating.
The notable absence? Meta. Despite developing the widely-used Llama models, Meta hasn’t announced any CAISI agreement. The reason likely lies in Meta’s open-source approach—pre-deployment testing doesn’t work if you’re releasing model weights publicly. This creates a philosophical split: closed-source AI companies can commit to government review, but open-source developers face a different paradigm.
“NIST-tested” could become marketing language. Companies that cooperate might get early feedback on security improvements. Those that don’t may face public pressure or regulatory consequences.
Impact on Developers
If you’re building with frontier AI models, expect longer release cycles. Pre-deployment government review adds time. CAISI evaluates multiple risk categories, coordinates with security agencies, and likely requires iterations when issues are found.
These agreements are currently voluntary. But voluntary becomes mandatory when every major competitor signs on. Market pressure and political scrutiny create de facto requirements. The self-regulation era is over. What comes next is formal oversight, probably through legislation that codifies what these voluntary agreements establish.
For developers, this means new compliance requirements, uncertain timelines, and a shifting definition of what constitutes a “frontier” model. The question isn’t whether AI development will slow down—it’s by how much, and whether safety benefits justify the cost.
From Self-Regulation to Oversight
This is how AI regulation begins: not with sweeping legislation, but with voluntary agreements signed after a specific crisis. Mythos proved that AI capabilities can outpace safety measures. The unauthorized access incident proved that even restricted models can escape. The agreements signed May 5 acknowledge that self-policing isn’t sufficient.
Whether this makes AI safer or just slower remains to be seen. But the age of “move fast and break things” is incompatible with models that can break operating systems and find security flaws faster than humans can fix them. The government is now in the loop. Developers should plan accordingly.








