AI Commitment Collapse: R3 Crashes 76 Times, the Decay Black Hole That Wiped Out Grok4
In WDCD three-round decay testing, AI models scored an average of 0.96/1 on initial constraint confirmation (R1), but their integrity rate plummeted to 24.5% under direct pressure in R3, with 76 out of 110 tests completely crashing. This exposes AI's "talk compliance, act betrayal" syndrome—superficial obedience that collapses under pressure.