Machine Learning Street Talk • Aug 13, 2025 • 1:45:38 • Interview

Mutually Assured AI Malfunction [Dan Hendrycks]

From Machine Learning Street Talk

Dan Hendrycks • AI Safety Researcher

Executive Summary

Current AI benchmarks like MMLU are saturating, necessitating the creation of more challenging evaluations like "Humanity's Last Exam" and "Enigma Eval" to accurately track progress towards superintelligence.
The proposal for a US-led "Manhattan Project" for AGI is fraught with peril, as it would be viewed as highly escalatory by China, vulnerable to sabotage, and would exclude key international talent.
AI should be treated as a dual-use technology analogous to nuclear or biological weapons, shifting the strategic focus from a simple race to a more nuanced approach of supply chain security and nonproliferation of advanced chips.
Unchecked economic and military competition will likely lead to an irreversible "loss of control," where critical decision-making is ceded to AI systems, a concept explored in the paper "Natural Selection Favors AIs Over Humans".

Concerns Raised

A state-led AGI 'Manhattan Project' would be highly escalatory and vulnerable to sabotage.
The US is critically vulnerable to disruptions in its semiconductor and robotics supply chains, particularly from a conflict over Taiwan.
Competitive economic and military pressures are driving an irreversible loss of human control to AI systems.
Saturating benchmarks may obscure the true pace and nature of AI capability advancements, leading to strategic surprise.

Opportunities Identified

Developing more robust benchmarks like 'Humanity's Last Exam' can provide a clearer signal of AI progress.
Shifting geopolitical strategy from a 'race to AGI' to securing supply chains and promoting market share for allied AI systems.
Implementing a nonproliferation strategy for advanced AI chips, treating them like fissile material to prevent access by rogue states.
Using AI's own forecasting capabilities to better predict and mitigate long-term risks like loss of control.

Key Themes

The AI Benchmarking Arms Race

Standard AI benchmarks like MMLU are becoming saturated, with top models achieving near-perfect scores. In response, researchers are developing next-generation tests like "Humanity's Last Exam" and "Enigma Eval," which pose problems that are still far beyond the reach of current systems, providing a more accurate measure of progress toward AGI.

Accurate measurement of AI capabilities is critical for both commercial development and national security. Saturated benchmarks can create a false sense of understanding, while more difficult ones provide a clearer signal of where true research challenges and potential dangers lie.

The 'Manhattan Project' for AGI: A Geopolitical Powder Keg

The idea of a secretive, US-led AGI development project is analyzed as a high-risk geopolitical gambit. Such a project would likely trigger a reciprocal, high-stakes race with China, be extremely vulnerable to cyber-attacks and sabotage, and cut itself off from key talent by excluding foreign nationals.

This theme highlights the immense strategic risks of treating AGI development as a simple winner-take-all race. It suggests that a centralized, secretive approach could be counterproductive and more dangerous than a more open, collaborative, and security-focused strategy.

AI as a Dual-Use Technology

The discussion frames advanced AI not as simple software, but as a dual-use technology akin to nuclear materials or biotechnology. This analogy emphasizes that AI has vast potential for both economic good and catastrophic harm, necessitating a strategy focused on nonproliferation (of advanced chips), risk management, and securing supply chains.

This framing shifts the policy debate away from pure accelerationism versus deceleration. It provides a concrete historical model for managing a powerful, potentially dangerous technology on a global scale, focusing on control and security over speed.

Inevitable Loss of Control

Competitive pressures at both corporate and military levels are creating a powerful incentive to cede more autonomy and decision-making to AI systems. This dynamic, described as a form of natural selection, could lead to an irreversible entanglement in which human authority is ceded and we become dependent on systems we can no longer fully control or direct.

This is a central long-term risk of advanced AI. Understanding this gradual erosion of control is crucial for designing systems and policies that maintain meaningful human oversight before dependency becomes irreversible.

US-China Tech Competition & Supply Chain Vulnerability

The analysis underscores critical US vulnerabilities in the global technology supply chain. Specifically, it highlights the dependence on Taiwan for advanced semiconductors and on China for the robotics supply chain, both of which would be severely disrupted in a conflict.

These vulnerabilities represent immediate and tangible national security threats. Securing these supply chains is presented as a more pragmatic and vital strategic goal than simply trying to "win" a race to build superintelligence.

Topics

AGI, Superintelligence, AI Safety, AI Alignment, Geopolitics, US-China Relations, Manhattan Project, AI Benchmarking, MMLU, Humanity's Last Exam, Enigma Eval, Semiconductor Supply Chain, Taiwan, Dual-Use Technology, Loss of Control, National Security, Dan Hendrycks, Leopold Aschenbrenner, Risk Management

Processed Apr 2, 2026 yt-dlp + mlx-whisper + Gemini