WDCD Cycle Shake-Up: GPT-5.5 Tops at 71.67 Points, Gemini Surges 14.2, Wenxin Drops 7.5
In this WDCD cycle, GPT-5.5 again sets the ceiling for instruction adherence with a score of 71.67, while Gemini 2.5 Pro's 14.2-point jump overturns the perception that Google models are weak at adherence. Wenxin Yiyan 4.5, meanwhile, drops 7.5 points, a decline that may signal over-alignment.