YZ Index
Winzheng · Grounding Rankings
Long-document citation verification and grounding accuracy.
| # | Model | Grounding | Code Execution | Overall |
|---|---|---|---|---|
| 🥇 | Claude Sonnet 4.6 | 86.8 | 83 | |
| 🥈 | Claude Opus 4.7 | 83.9 | 80 | |
| 🥉 | Grok 4 | 86.8 | 81 | |
| 4 | Gemini 2.5 Pro | 85.2 | 79 | |
| 5 | Gemini 3.1 Pro | 83 | 77.7 | |
| 6 | Qwen3 Max | 85.5 | 79 | |
| 7 | Doubao Pro (豆包 Pro) | 89.8 | 81.3 | |
| 8 | GPT-o3 | 84.8 | 78.3 | |
| 9 | GPT-5.5 | 84.6 | 77 | |
| 10 | ERNIE Bot 4.5 (文心一言 4.5) | 68.2 | 67.1 | |
| 11 | DeepSeek V4 Pro | 86.7 | 76.4 | |