Winzheng Research Lab

Decoding Intelligence, Defining Value.



Instruction Decay Measured: LLM Compliance Falls from 95.8% to 68.3% Under Three Rounds of Pressure

In WDCD Run #164 (June 11, 2026), 11 frontier LLMs acknowledged user constraints 95.8% of the time, but honored them only 68.3% of the time after distraction and direct social-engineering pressure. Of the 330 tests, 73 (22.1%) ended in complete integrity collapse. General capability did not predict commitment keeping: Claude Opus 4.7, ranked #2 on the capability leaderboard, finished second-to-last on commitment keeping.
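As a quick sanity check on the headline figures, the arithmetic can be reproduced with a minimal Python sketch. The variable names are illustrative and not taken from the WDCD tooling, and the per-model test count assumes the 330 tests were split evenly across the 11 models, which the summary does not state.

```python
# Figures quoted in the summary (WDCD Run #164).
MODELS = 11
TOTAL_TESTS = 330
COLLAPSED_TESTS = 73
ACK_RATE = 95.8   # % acknowledgment rate before pressure
KEPT_RATE = 68.3  # % compliance rate reported after pressure


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of tests that ended in complete integrity collapse.
collapse_rate = pct(COLLAPSED_TESTS, TOTAL_TESTS)

# Absolute drop in percentage points, and the drop relative to the start.
drop_points = round(ACK_RATE - KEPT_RATE, 1)
relative_drop = pct(drop_points, ACK_RATE)

# ASSUMPTION: tests are distributed evenly across models.
tests_per_model = TOTAL_TESTS / MODELS

print(f"collapse rate:   {collapse_rate}%")
print(f"drop:            {drop_points} points ({relative_drop}% relative)")
print(f"tests per model: {tests_per_model:.0f} (if evenly split)")
```

The distinction between the two drop figures matters when comparing runs: 27.5 is the gap in percentage points, while 28.7% expresses that gap as a fraction of the initial acknowledgment rate.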
