What's actually happening with AI?: Cautious builders
New to AI alignment and safety
On a Tuesday in October 2024, Anthropic published Claude’s character spec — the actual internal reasoning about what kind of entity they were trying to build. Page after page of agonized trade-offs.
Reading the load-bearing calculations
We read that document the way a structural engineer reads bridge calculations. With respect, and with a pen, because respecting the work means checking it.
We find the accelerationists’ bravado as alarming as the pause advocates’ paralysis. In March 2024, a health-tech startup deployed an AI triage system in four rural ERs. Ninety-four percent accuracy on test data. In production, it systematically undertriaged atypical cardiac events — disproportionately in women and patients under fifty. Three patients sent home with ibuprofen. Two returned within forty-eight hours in critical condition. That is what building looks like. You find the failure mode that did not appear in testing because the training data reflected the same biases.
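How does a model score 94% overall and still fail this way? Because an aggregate metric is a weighted average: a small subgroup can have a dismal miss rate without moving the headline number. A minimal sketch of that arithmetic, with entirely hypothetical data and rates (the `make_cases`, `recall`, and miss-rate parameters below are illustrative, not the startup's actual system):

```python
import random

random.seed(0)

# Simulate triage cases where the model under-flags "atypical" presentations,
# mirroring a training set that reflected the same bias.
def make_cases(n, atypical_rate, miss_rate_typical, miss_rate_atypical):
    cases = []
    for _ in range(n):
        atypical = random.random() < atypical_rate
        urgent = random.random() < 0.2          # true label: needs escalation
        miss = miss_rate_atypical if atypical else miss_rate_typical
        predicted_urgent = urgent and (random.random() > miss)
        cases.append((atypical, urgent, predicted_urgent))
    return cases

def recall(cases, subgroup=None):
    # Fraction of truly urgent cases the model flagged,
    # optionally restricted to one subgroup.
    relevant = [c for c in cases if subgroup is None or c[0] == subgroup]
    urgent = [c for c in relevant if c[1]]
    if not urgent:
        return None
    return sum(1 for c in urgent if c[2]) / len(urgent)

cases = make_cases(10_000, atypical_rate=0.15,
                   miss_rate_typical=0.05, miss_rate_atypical=0.40)

print(f"overall recall:  {recall(cases):.2f}")
print(f"typical recall:  {recall(cases, subgroup=False):.2f}")
print(f"atypical recall: {recall(cases, subgroup=True):.2f}")
```

The overall number looks deployable; the atypical subgroup is catastrophic. The eval that catches this is not harder to run, it just has to be asked for, which is why "check it" means slicing by subgroup before shipping.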
The alignment question is real. The accelerationists argue alignment follows capability the way a wake follows a ship. We argue nothing demands that. Alignment is unsolved, and unsolved problems do not resolve on convenient schedules. Leaded gasoline persisted sixty years after its toxicity was documented because the profiting industry funded the science that delayed the reckoning.
The grandmother navigating Medicare through ChatGPT is our user. When the model hallucinates a nonexistent Medicare provision, that is our failure. Trust-based design — systems optimizing for long-term judgment rather than short-term attention — is the alternative architecture.
Where we concede ground: Our caution has a body count too. Every month spent on safety evals is a month AI diagnostics are not catching cancers they could have caught.
What would change our mind: An adaptive international governance framework that moves at model-development speed with enforcement teeth.
Read the full synthesis: What’s actually happening with AI?