GPT-o3
Change Analysis · 2026 Week26
GPT-o3, 2026 Week 26: the Knowledge Synthesis (v5) dimension rose 3.5 points, from 85.9 to 89.4.
Score Comparison
Previous: 74.2 · Current: 75.4 · Change: +1.2
| Dimension | Previous | Current | Change |
|---|---|---|---|
| Code Execution (v5) | 86.7 | 87.2 | +0.5 |
| Knowledge Synthesis (v5) | 85.9 | 89.4 | +3.5 |
| Grounding (v5) | 94.2 | 93.6 | -0.6 |
| Value | 10.6 | 10.8 | +0.2 |
| Stability | 55.7 | 57.8 | +2.1 |
| Availability | 98 | 98 | 0 |
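The Change column is the current score minus the previous score for each dimension. A minimal sketch that recomputes it from the table values and picks out the largest mover follows; the names `scores`, `compute_changes`, and `largest_mover` are illustrative and not part of the evaluation tooling, and rounding to one decimal place is an assumption based on the precision shown in the table.

```python
# Scores copied from the table above as (previous, current) pairs.
scores = {
    "Code Execution (v5)": (86.7, 87.2),
    "Knowledge Synthesis (v5)": (85.9, 89.4),
    "Grounding (v5)": (94.2, 93.6),
    "Value": (10.6, 10.8),
    "Stability": (55.7, 57.8),
    "Availability": (98, 98),
}


def compute_changes(table):
    """Return current minus previous per dimension, rounded to one decimal.

    Rounding is assumed here to match the table's displayed precision
    and to hide floating-point noise in the subtraction.
    """
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


changes = compute_changes(scores)

# The dimension with the largest absolute change, regardless of direction.
largest_mover = max(changes, key=lambda name: abs(changes[name]))

for name, delta in changes.items():
    print(f"{name}: {delta:+}")
print("Largest mover:", largest_mover)
```

Ranking by absolute change means a large drop would surface as readily as a large gain, which suits a weekly movers report.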
All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.