YZ Index

Winzheng · Engineering Judgment Rankings

Architecture review, risk assessment, engineering decision-making.

Side Dimension Rankings — Communication + Judgment sub-dimension performance
#   Model                       Engineering Judgment  Code Execution  Overall
🥇  Claude Sonnet 4.6 (claude)  93.2                  87.6            87.2
🥈  Claude Opus 4.7 (claude)    93.1                  90.3            89
🥉  GPT-5.5 (gpt)               92.1                  81.9            80.9
4   GPT-o3 (gpt)                91.5                  84.8            82.8
5   Doubao Pro (doubao)         88.8                  94.6            88.8
6   Gemini 2.5 Pro (gemini)     87.7                  88.1            86.4
7   Qwen3 Max (qwen)            85.7                  89.7            86.2
8   Gemini 3.1 Pro (gemini)     85.2                  88.4            84.8
9   DeepSeek V4 Pro (DeepSeek)  82.4                  87.9            83.3
10  Grok 4 (grok)               82.1                  93.9            89.9
11  ERNIE Bot 4.5 (ernie)       72.2                  78              76.9