GPT-o3

Change Analysis · 2026 Week27

GPT-o3, 2026 Week27: the Code Execution (v5) dimension dropped 19.1 points

Score Comparison

Dimension                  Previous  Current  Change
Overall                    75.4      69.7     -5.7
Code Execution (v5)        87.2      68.1     -19.1
Knowledge Synthesis (v5)   89.4      89.0     -0.4
Grounding (v5)             93.6      94.9     +1.3
Value                      10.8      10.2     -0.6
Stability                  57.8      51.0     -6.8
Availability               98.0      95.9     -2.1
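
The Change column is Current minus Previous. As a minimal sketch, the per-dimension values above can be recomputed and the largest drop picked out as follows. The names `scores` and `changes` are illustrative only and are not part of the dashboard.

```python
# Per-dimension scores copied from the comparison table: (previous, current).
scores = {
    "Code Execution (v5)": (87.2, 68.1),
    "Knowledge Synthesis (v5)": (89.4, 89.0),
    "Grounding (v5)": (93.6, 94.9),
    "Value": (10.8, 10.2),
    "Stability": (57.8, 51.0),
    "Availability": (98.0, 95.9),
}


def changes(table):
    """Return current minus previous for each dimension, rounded to one decimal."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


deltas = changes(scores)

# The dimension with the most negative change is the largest drop.
largest_drop = min(deltas, key=deltas.get)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
print("Largest drop:", largest_drop)
```

Rounding to one decimal keeps floating-point noise out of the comparison with the published figures.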

All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.