Grok 3

Change Analysis · 2026 Week15

Grok 3, 2026 Week15: Code Execution (v5) dimension dropped 14.4 pts

Score Comparison

Dimension                  Previous  Current  Change
Overall                    65.6      60.5     -5.1
Code Execution (v5)        91.2      76.8     -14.4
Knowledge Synthesis (v5)   51.6      49.6     -2
Grounding (v5)             83.8      83.1     -0.7
Value                      25.1      22.9     -2.2
Stability                  35.9      30.4     -5.5
Availability               100       100      0
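
The Change column is simply Current minus Previous for each dimension. As a minimal sketch for re-checking the figures, the following recomputes the per-dimension changes from the values in the table and picks out the largest drop. The names `scores` and `score_changes` are hypothetical and are not part of the leaderboard's tooling.

```python
# Per-dimension scores copied from the table: (previous, current).
scores = {
    "Code Execution (v5)": (91.2, 76.8),
    "Knowledge Synthesis (v5)": (51.6, 49.6),
    "Grounding (v5)": (83.8, 83.1),
    "Value": (25.1, 22.9),
    "Stability": (35.9, 30.4),
    "Availability": (100.0, 100.0),
}


def score_changes(table):
    """Return {dimension: current - previous}, rounded to one decimal place."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


changes = score_changes(scores)

# The dimension with the most negative change is the largest drop.
largest_drop = min(changes, key=changes.get)

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}")
print("Largest drop:", largest_drop)
```

Rounding to one decimal place matches the precision of the published scores and avoids floating-point residue in the differences.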

All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.