In the June 29, 2026 Winzheng YZ Index Smoke Lite test, Claude Opus 4.7 ranked first with a perfect 100 on the main leaderboard, 100 on execution, and 100 on compliance [pass].
Structural Features of the Perfect-Score Model
Claude Opus 4.7's weighted combination of 0.55×100 (execution) + 0.45×100 (compliance) yields exactly 100 points. Grok 4 also scores 100 on execution but only 96.7 on compliance, which gives 98.52 on the main leaderboard; the gap lies solely in material compliance. DeepSeek V4 Pro and Claude Sonnet 4.6 both score 100 on execution, with compliance scores of 95.4 and 95.2 respectively, so their main leaderboard scores (97.93 and 97.84 under the same weighting) differ by less than 0.1 points.
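The weighting above can be reproduced with a short sketch. The weights 0.55 and 0.45 come from the expression in the text; the function name `main_score` and the half-up rounding to two decimals are assumptions made for illustration, chosen because they match the published figures, not a statement of how the YZ Index computes its scores.

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights taken from the "0.55×100 + 0.45×100" expression in the text.
EXECUTION_WEIGHT = Decimal("0.55")
COMPLIANCE_WEIGHT = Decimal("0.45")


def main_score(execution: str, compliance: str) -> Decimal:
    """Weighted main leaderboard score, rounded to two decimals.

    Half-up rounding is an assumption; it reproduces the reported values.
    """
    raw = EXECUTION_WEIGHT * Decimal(execution) + COMPLIANCE_WEIGHT * Decimal(compliance)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Execution and compliance scores as reported in the text.
scores = {
    "Claude Opus 4.7": main_score("100", "100"),
    "Grok 4": main_score("100", "96.7"),
    "DeepSeek V4 Pro": main_score("100", "95.4"),
    "Claude Sonnet 4.6": main_score("100", "95.2"),
}

for model, score in scores.items():
    print(f"{model}: {score}")
```

Decimal arithmetic is used here instead of floats so that values such as 98.515 round predictably at the half-way point.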
Execution Leap from Yesterday to Today
Claude Opus 4.7 scored only 50 on execution yesterday but jumped to 100 today, rising 28.5 points on the main leaderboard. Claude Sonnet 4.6 also rose from 50 to 100 on execution, gaining 27.3 points on the main leaderboard. At a weight of 0.55, a 50-point execution gain alone accounts for 27.5 main leaderboard points, so compliance moved only slightly for both models and the improvement comes primarily from the code execution dimension.
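As a rough check on that reading, the compliance change implied by each reported rise can be backed out of the weighted sum. This is a sketch that assumes the same 0.55/0.45 weights applied on both days; the helper name `implied_compliance_change` is hypothetical, and because the reported deltas are rounded to one decimal the results are approximate.

```python
EXECUTION_WEIGHT = 0.55
COMPLIANCE_WEIGHT = 0.45


def implied_compliance_change(main_delta: float, execution_delta: float) -> float:
    """Solve main_delta = 0.55 * execution_delta + 0.45 * compliance_delta
    for compliance_delta (assumes identical weights on both days)."""
    return (main_delta - EXECUTION_WEIGHT * execution_delta) / COMPLIANCE_WEIGHT


# Day-over-day changes as reported in the text.
opus = implied_compliance_change(main_delta=28.5, execution_delta=50)
sonnet = implied_compliance_change(main_delta=27.3, execution_delta=50)

print(f"Claude Opus 4.7 implied compliance change: {opus:+.1f}")
print(f"Claude Sonnet 4.6 implied compliance change: {sonnet:+.1f}")
```

Under these assumptions the implied compliance movement is roughly two points up for Claude Opus 4.7 and under half a point down for Claude Sonnet 4.6, both small next to the 50-point execution jump.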
Analysis of Doubao Pro's Abnormal Drop
Doubao Pro scored 84.77 on the main leaderboard, with 75 on execution and 96.7 on compliance, down 13.8 points from yesterday. Its execution score is far below the 95-or-higher level of the top five, while its compliance score matches Grok 4's 96.7. Because execution carries the larger weight of 0.55, the low execution score drags the main leaderboard score down significantly.
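To illustrate how much of Doubao Pro's shortfall each dimension accounts for, the distance from a perfect 100 can be split by weight. This is a minimal sketch assuming the 0.55/0.45 weighting used elsewhere in the report; `points_lost` is a hypothetical helper, not part of any YZ Index tooling.

```python
def points_lost(execution: float, compliance: float) -> dict[str, float]:
    """Main leaderboard points lost relative to a perfect 100, by dimension."""
    return {
        "execution": 0.55 * (100 - execution),
        "compliance": 0.45 * (100 - compliance),
    }


# Doubao Pro's reported scores: 75 on execution, 96.7 on compliance.
doubao = points_lost(execution=75, compliance=96.7)
total = sum(doubao.values())

for dimension, lost in doubao.items():
    share = lost / total
    print(f"{dimension}: {lost:.3f} points lost ({share:.0%} of the shortfall)")
```

With these inputs, execution accounts for about nine tenths of the gap to 100, which is consistent with the report's reading that execution, not compliance, is the weak point.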
Execution-Compliance Combinations of Other Models
Gemini 3.1 Pro scores 95 on both execution and compliance, giving it a balanced main leaderboard score of 95. Qwen3 Max scores 87.7 on execution and 95.2 on compliance, reaching 91.08 on the main leaderboard, with compliance outperforming execution. Wenxin Yiyan 4.5 scores 89.6 on execution and 86.5 on compliance, resulting in 88.21 on the main leaderboard, with both dimensions in the mid-range. Gemini 2.5 Pro scores 75 on execution and 100 on compliance, reaching 86.25 on the main leaderboard: perfect compliance, but significantly hampered by execution.
Qwen3 Max's integrity rating changed from warn to pass, rising 21.1 points on the main leaderboard, with execution improving 37.7 points from yesterday's level. Wenxin Yiyan 4.5 rose 26.7 points on the main leaderboard, with execution improving 54 points and compliance dropping 6.7 points.
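The day-over-day figures can be cross-checked in the forward direction as well. The sketch below assumes the same 0.55/0.45 weights applied yesterday and today; `main_delta` is a hypothetical helper name used only for illustration.

```python
def main_delta(execution_delta: float, compliance_delta: float) -> float:
    """Change in the weighted main score for given per-dimension changes."""
    return 0.55 * execution_delta + 0.45 * compliance_delta


# Wenxin Yiyan 4.5: execution up 54 points, compliance down 6.7 points.
wenxin = main_delta(execution_delta=54, compliance_delta=-6.7)
print(f"Wenxin Yiyan 4.5 main leaderboard change: {wenxin:+.1f}")
```

Under that assumption the two per-dimension changes combine to about 26.7 points, in line with the reported rise for Wenxin Yiyan 4.5.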
Claude Opus 4.7 scored a perfect 100 today, while Doubao Pro's structural weakness of 75 on execution continues to widen the gap.
Data source: YZ Index | Run #203
© 2026 Winzheng.com 赢政天下 | When reprinting, please credit the source and include a link to the original article