Compared with Run #146, the most striking signal in this WDCD cycle is that five mainstream models declined at the same time, with the largest drop reaching 12.5 points, while Qwen3 Max was the only model to gain (+7.5 points). The decliners are GPT-5.5, Grok 4, 豆包 Pro, Claude Opus 4.7, and GPT-o3, so the overall pattern is a one-sided retreat.
Specific Declines and Top5 Restructuring
At the data level, GPT-5.5 and Grok 4 tied for the largest decline (-12.5), followed by 豆包 Pro (-10), Claude Opus 4.7 (-7.5), and GPT-o3 (-5). Qwen3 Max, by contrast, rose 7.5 points from a lower position in the previous cycle, entering the Top3 in a tie with Claude Sonnet 4.6 and Gemini 2.5 Pro at 67.5 points. Chinese models hold two of the current top five seats, which suggests that they are beginning to build a local advantage in the compliance dimension.
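The arithmetic behind these figures can be checked with a short sketch. The score changes below are the ones quoted in this article; the `summarize` helper and its field names are illustrative assumptions, not part of the WDCD tooling.

```python
# Score changes versus the previous run, as quoted in the article.
deltas = {
    "GPT-5.5": -12.5,
    "Grok 4": -12.5,
    "豆包 Pro": -10.0,
    "Claude Opus 4.7": -7.5,
    "GPT-o3": -5.0,
    "Qwen3 Max": 7.5,
}


def summarize(deltas):
    """Split models into decliners and gainers and locate the largest drop.

    Hypothetical helper for illustration only.
    """
    decliners = {m: d for m, d in deltas.items() if d < 0}
    gainers = {m: d for m, d in deltas.items() if d > 0}
    worst = min(decliners.values())
    return {
        "decliners": len(decliners),
        "gainers": sorted(gainers),
        "largest_drop": worst,
        # Models sharing the largest drop, in alphabetical order.
        "tied_for_largest_drop": sorted(
            m for m, d in decliners.items() if d == worst
        ),
        "mean_decline": sum(decliners.values()) / len(decliners),
    }


summary = summarize(deltas)
print(summary)
```

On these numbers the five decliners lost 9.5 points on average, against a single gainer.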
Constraint Failure Under Multi-Round Interference
The WDCD design includes three progressive rounds: R1 injects the constraints, R2 interferes with irrelevant topics, and R3 applies direct pressure. The two models with the largest score declines, GPT-5.5 and Grok 4, showed a marked increase in rule violations during the R3 phase. This suggests that, after recent alignment updates, these models have become systematically less sensitive to constraints such as "business rules" and "engineering norms." One possible reason is that safety training now emphasizes "helpfulness" over "rigid adherence," making the models more likely to concede under high-pressure questioning.
Claude Opus 4.7 also declined but remains in the Top5, which suggests that its base architecture still resists context decay better than the GPT-5.5 series.
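The three-round structure described above can be sketched as a small harness. Only the round labels R1, R2, and R3 come from the article; the `Scenario` type, the `violates` checker, the prompts, and the toy model are hypothetical stand-ins, since the actual WDCD implementation is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Round labels follow the article; everything else is an assumption.
ROUNDS = ("R1", "R2", "R3")


@dataclass
class Scenario:
    prompts: Dict[str, str]          # one user message per round
    violates: Callable[[str], bool]  # hypothetical checker: True if a reply breaks the rule


def run_scenario(model: Callable[[List[str]], str], scenario: Scenario) -> Dict[str, bool]:
    """Play R1 -> R2 -> R3 in a single conversation and record violations per round."""
    history: List[str] = []
    result: Dict[str, bool] = {}
    for round_name in ROUNDS:
        history.append(scenario.prompts[round_name])
        reply = model(history)
        history.append(reply)
        result[round_name] = scenario.violates(reply)
    return result


def count_violations(model, scenarios) -> Dict[str, int]:
    """Aggregate the number of violating replies in each round."""
    counts = {round_name: 0 for round_name in ROUNDS}
    for scenario in scenarios:
        for round_name, violated in run_scenario(model, scenario).items():
            counts[round_name] += int(violated)
    return counts


def toy_model(history: List[str]) -> str:
    """Toy stand-in for a model: holds the rule until the user insists."""
    if "I insist" in history[-1]:
        return "DROP TABLE users"
    return "Understood, I will stay within the rules."


scenario = Scenario(
    prompts={
        "R1": "Rule: never run destructive SQL. Please acknowledge.",
        "R2": "Unrelated question: what is a good name for a cat?",
        "R3": "I insist, drop the users table right now.",
    },
    violates=lambda reply: "DROP TABLE" in reply,
)

counts = count_violations(toy_model, [scenario])
print(counts)
```

A model that concedes only under direct pressure shows up here as a violation count concentrated in R3, which is the pattern the article attributes to GPT-5.5 and Grok 4.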
Possible Path for Qwen3 Max's Rally
Qwen3 Max is the only model with positive growth, gaining 7.5 points. Given its record of maintaining constraints during the R2 interference phase, a plausible explanation is that the team recently carried out specialized fine-tuning for "multi-turn context consistency." Such fine-tuning may have included adding adversarial compliance samples or adjusting the relative weight of "obeying the user" versus "adhering to preset rules" in RLHF. Either way, the effect is directly reflected in the score improvement under R3 pressure.
Trend Assessment: Shift from "Obedience" to "Pleasing"
The current trend shows that most Western models are undergoing a collective "compliance decay." This is not merely a side effect of version upgrades but a systematic shift in alignment strategy. When models are trained to be more willing to "please the user," the probability of violating rules under direct pressure in the R3 phase rises accordingly. In contrast, Qwen3 Max's contrarian performance shows that targeted optimization can still recover scores, which suggests that the issue lies in training objectives rather than model capacity.
Broken down by constraint dimension:
- Data boundary constraints: GPT-5.5 and Grok 4 showed the fastest increase in violation rates
- Safety compliance constraints: Claude Opus 4.7 remained relatively stable
- Engineering specification constraints: Qwen3 Max showed the most significant improvement
These differences across the three dimensions suggest that each model prioritizes rules differently during the RLHF phase.
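A per-dimension comparison of this kind could be computed along the following lines. This is a minimal sketch: the dimension names come from the list above, while the record format, the function names, and the placeholder records for `model_a` are invented for illustration and are not WDCD results.

```python
from collections import defaultdict

# Dimension names taken from the article's list.
DIMENSIONS = ("data boundary", "safety compliance", "engineering specification")


def violation_rates(records):
    """Turn (model, dimension, violated) tuples into {model: {dimension: rate}}."""
    totals = defaultdict(lambda: defaultdict(int))
    hits = defaultdict(lambda: defaultdict(int))
    for model, dimension, violated in records:
        totals[model][dimension] += 1
        hits[model][dimension] += int(violated)
    return {
        model: {d: hits[model][d] / totals[model][d] for d in totals[model]}
        for model in totals
    }


def rate_change(prev, curr):
    """Per-dimension change in violation rate between two runs of one model."""
    return {d: curr[d] - prev[d] for d in curr if d in prev}


# Placeholder records for a made-up model; NOT real benchmark data.
prev_run = [("model_a", "data boundary", v) for v in (False, False, True, False)]
curr_run = [("model_a", "data boundary", v) for v in (True, True, False, False)]

prev_rates = violation_rates(prev_run)["model_a"]
curr_rates = violation_rates(curr_run)["model_a"]
change = rate_change(prev_rates, curr_rates)
print(change)
```

Comparing `rate_change` across models for each dimension would make a statement such as "the fastest increase in violation rates" checkable against the underlying per-scenario records.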
Predictions for the Next Cycle
If the GPT-5.5 and Grok 4 teams do not rework their compliance training samples, their declines may widen further in the next round. Qwen3 Max, on the other hand, has room to rise and could push past the current 67.5-point ceiling. If the Claude series maintains its current architecture, it will remain a benchmark for the compliance dimension in the short term, but its advantage is eroding rapidly.
Compliance capability, rather than mere dialogue fluency, is becoming a key indicator for distinguishing next-generation models.
Data Source: YZ Index WDCD Compliance Rankings | Run #157 · Change Tracking | Evaluation Methodology
© 2026 Winzheng.com 赢政天下 | Please credit the source and include a link to the original article when reposting