The bar exam nobody studied for
GPT-4 passed the Uniform Bar Examination with a score OpenAI reported at around the ninetieth percentile. Nobody specifically trained it on legal reasoning. It had been trained on text. The law was in the text. The capability was emergent. Nobody predicted it at that level on that timeline.
We treat p(doom) — the probability of existential catastrophe from AI — as a number you should take as seriously as a cancer diagnosis. Not hysterical. Clinical. The honest disagreement among researchers is whether it is 5 percent or 50 percent. That the range is that wide should itself alarm you.
We are building systems whose capabilities we cannot predict, whose internal reasoning we cannot fully inspect, and whose alignment with human values we cannot verify — deploying them at civilizational scale with all three problems unsolved. When builders cannot explain why a system does what it does, claiming they can guarantee what it will not do is not engineering. It is faith.
The accelerationists counter with productivity gains. We do not deny the benefits. We deny the inference. Optimizing for near-term benefit while ignoring catastrophic tail risk is the logic that built every industrial disaster. The alignment problem is not a thought experiment. In 2022, researchers at Collaborations Pharmaceuticals reported that a drug-discovery model, redirected to reward toxicity instead of penalizing it, generated tens of thousands of candidate chemical weapon molecules in under six hours.
Current alignment methods depend on human feedback, so they break down once the system becomes more capable than the humans evaluating it. We are approaching that threshold.
Where we concede ground: We have predicted imminent catastrophe on timelines that repeatedly did not materialize. Repeated false alarms erode genuine warnings.
What would change our mind: Two more frontier generations developed with full interpretability and transparent reasoning under independent review.
Read the full synthesis: How scared should we be of AI?