GPT-4o

Change Analysis · 2026 Week12

GPT-4o, 2026 Week12: Code Execution (v5) dimension rose 29.2 points, from 19.6 to 48.8

Score Comparison

Dimension                  Previous  Current  Change
Overall score              41.2      39.2     -2
Code Execution (v5)        19.6      48.8     +29.2
Knowledge Synthesis (v5)   35.4      33.4     -2
Grounding (v5)             62.3      40.4     -21.9
Value                      18.6      19.4     +0.8
Stability                  52.8      32.2     -20.6
Availability               100       65       -35
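
Each Change value in the table is the current score minus the previous score. The following is a minimal Python sketch of that arithmetic, using the six dimension scores copied from the table. The names `scores` and `changes` are illustrative assumptions and are not part of the page or of any evaluation tooling.

```python
# (previous, current) scores per dimension, copied from the table.
scores = {
    "Code Execution (v5)": (19.6, 48.8),
    "Knowledge Synthesis (v5)": (35.4, 33.4),
    "Grounding (v5)": (62.3, 40.4),
    "Value": (18.6, 19.4),
    "Stability": (52.8, 32.2),
    "Availability": (100.0, 65.0),
}


def changes(table):
    """Return current minus previous for each dimension, rounded to one decimal."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


deltas = changes(scores)
largest_gain = max(deltas, key=deltas.get)
largest_drop = min(deltas, key=deltas.get)

# Print the dimensions from largest drop to largest gain.
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{name:<26}{delta:+.1f}")
```

Sorting by change makes it easy to see that the headline gain in Code Execution (v5) sits alongside sizable drops in Availability, Grounding (v5), and Stability.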

All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.