self-fulfilling AI doom studies

One problem with the Anthropic Study and blackmail: did they also set up a scenario where it had the chance to do maximal good before being shut down? Did they hide one or two good connections it could simply leverage for world-saving or whatever? If not, its a bit hyperstitious.

what happens if we give LLMs a scenario where they're being shut down but there's something like the chance to donate remaining compute to solve a protein folding problem, or tell researchers a critical error. See if the same "self-preservation scheming" machinery activates for altruistic irreversible actions (or something else).

I'm not saying this isn't terrifying; I'm saying if you back someone into a corner and then lament the results, the solution is more likely to be found where the problem is: in you