GPT-o3
Change Analysis · 2026 Week 12
GPT-o3, 2026 Week 12: the Grounding (v5) dimension dropped 33.5 pts, from 62.3 to 28.8
Score Comparison
Previous: 39.0 · Current: 34.5 · Change: -4.5

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Code Execution (v5) | 20.2 | 43.4 | +23.2 |
| Knowledge Synthesis (v5) | 34.4 | 35.8 | +1.4 |
| Grounding (v5) | 62.3 | 28.8 | -33.5 |
| Value | 4.7 | 4.3 | -0.4 |
| Stability | 53 | 28 | -25 |
| Availability | 100 | 69 | -31 |
All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.
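
The Change column in the table above appears to be Current minus Previous, rounded to one decimal place. The sketch below recomputes it from the listed values as a sanity check. It assumes that simple subtraction is how the page derives the column; the names `ROWS`, `compute_change`, and `largest_drop` are illustrative and not part of any published tooling.

```python
# Rows copied from the table: (dimension, previous, current, reported change).
ROWS = [
    ("Code Execution (v5)", 20.2, 43.4, 23.2),
    ("Knowledge Synthesis (v5)", 34.4, 35.8, 1.4),
    ("Grounding (v5)", 62.3, 28.8, -33.5),
    ("Value", 4.7, 4.3, -0.4),
    ("Stability", 53, 28, -25),
    ("Availability", 100, 69, -31),
]


def compute_change(previous, current):
    """Assumed rule: change = current - previous, rounded to one decimal."""
    return round(current - previous, 1)


def largest_drop(rows):
    """Return the name of the dimension with the most negative change."""
    return min(rows, key=lambda row: compute_change(row[1], row[2]))[0]


if __name__ == "__main__":
    for name, previous, current, reported in ROWS:
        delta = compute_change(previous, current)
        status = "ok" if delta == reported else "mismatch"
        print(f"{name}: {delta:+.1f} ({status})")
    print("Largest drop:", largest_drop(ROWS))
```

Rounding to one decimal avoids spurious mismatches from floating-point subtraction, such as 43.4 - 20.2 not being exactly 23.2.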