What's actually happening with AI?: Pause advocates
The builders signed the letter
In May 2023, more than 350 AI researchers and industry leaders signed a one-sentence statement: mitigating the risk of extinction from AI should be a global priority alongside pandemics and nuclear war. The signatories included the CEOs of OpenAI, DeepMind, and Anthropic. They used the word extinction. They were not misquoted.
The people building the gun signed a letter saying the gun might be pointed at the species.
The alignment problem compounds. Every capability advance widens the gap between what the system can do and what we can verify about its intentions. GPT-4 developed emergent capabilities, including multi-step reasoning and tool use, that its designers did not anticipate and cannot fully explain. When the deeper analysis turned to what alignment requires at a metaphysical level, the conclusion was not reassuring: we may be instilling values in a system whose relationship to values is fundamentally unlike our own.
The cautious builders check the gauges. Gauge-checking assumes you know which gauges matter. The failure modes we fear are the ones that emerge from capabilities nobody predicted in systems nobody fully understands. You cannot check a gauge that does not exist yet.
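To make the gauge metaphor concrete, here is a minimal sketch of why monitoring only catches anticipated failures. The gauge names, thresholds, and readings are hypothetical, ours rather than any lab's actual safety dashboard:

```python
# A minimal sketch of the gauge-checking problem. Gauge names,
# thresholds, and readings are hypothetical, chosen for illustration.

KNOWN_GAUGES = {
    # alarm if the model refuses unsafe requests too rarely
    "refusal_rate": lambda v: v < 0.95,
    # alarm if a deception evaluation trips
    "deception_eval_score": lambda v: v > 0.10,
}

def check_gauges(readings: dict[str, float]) -> list[str]:
    """Return alarms, but only for gauges that were defined in advance."""
    return [
        name for name, is_bad in KNOWN_GAUGES.items()
        if name in readings and is_bad(readings[name])
    ]

# An emergent failure mode arrives under a name nobody predefined,
# so it never alarms: the check is only as good as the gauge list.
readings = {"refusal_rate": 0.99, "novel_goal_seeking": 1.0}
print(check_gauges(readings))  # -> []
```

The problem is structural: adding more gauges does not help against a failure mode that arrives under a name nobody listed.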
The accelerationists invoke the arms race: if we pause, China speeds up. We have heard this logic before. It produced a Cold War stockpile that peaked near 70,000 nuclear warheads. The nuclear non-proliferation regime eventually constrained everyone. Imperfectly. We are still here.
Where we concede ground: We have cried wolf. Yudkowsky warned that AGI could arrive by 2018. The date came and went. Every missed prediction weakens the next.
What would change our mind: Three consecutive frontier-model generations with full interpretability, with the findings reproduced by independent researchers, and with no emergent goal-seeking.
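For readers who want those criteria pinned down, one way is to encode them as a predicate over audit records. A minimal sketch, where the three-generation window and the record fields come from the sentence above and everything else is an illustrative assumption:

```python
from dataclasses import dataclass

@dataclass
class GenerationReport:
    """Hypothetical audit record for one frontier-model generation."""
    fully_interpretable: bool       # interpretability covers the whole model
    independently_reproduced: bool  # outside researchers reproduced the findings
    emergent_goal_seeking: bool     # any observed emergent goal-directed behavior

def would_change_our_mind(reports: list[GenerationReport]) -> bool:
    """True only if the three most recent generations each meet all three criteria."""
    recent = reports[-3:]
    return len(recent) == 3 and all(
        r.fully_interpretable
        and r.independently_reproduced
        and not r.emergent_goal_seeking
        for r in recent
    )
```

The conjunction is the point: one clean generation, or interpretability without independent reproduction, does not flip the predicate.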
Read the full synthesis: What’s actually happening with AI?