Grok 4

Change Analysis · 2026 Week27

Grok 4, 2026 Week27: Code Execution (v5) dimension dropped 16.2 points

Score Comparison

Dimension                 Previous  Current  Change
Overall                   76.9      71.1     -5.8
Code Execution (v5)       84.3      68.1     -16.2
Knowledge Synthesis (v5)  87.4      86.3     -1.1
Grounding (v5)            95.6      95.7     +0.1
Value                     28.9      27.6     -1.3
Stability                 53        41.5     -11.5
Availability              100       99       -1
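
The Change column in the table above appears to be Current minus Previous. A minimal Python sketch that rechecks that arithmetic and picks out the largest drop follows; the names `scores` and `changes` are hypothetical and exist only for this illustration, and the headline score is not recomputed because the page does not say how the dimensions are weighted.

```python
# Per-dimension scores copied from the table above: (previous, current).
# The dict and function names are illustrative, not part of any real API.
scores = {
    "Code Execution (v5)": (84.3, 68.1),
    "Knowledge Synthesis (v5)": (87.4, 86.3),
    "Grounding (v5)": (95.6, 95.7),
    "Value": (28.9, 27.6),
    "Stability": (53.0, 41.5),
    "Availability": (100.0, 99.0),
}


def changes(table):
    """Return {dimension: current - previous}, rounded to one decimal place."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


deltas = changes(scores)

# The most negative change is the largest drop.
largest_drop = min(deltas, key=deltas.get)
print(largest_drop, deltas[largest_drop])  # Code Execution (v5) -16.2
```

Rounding to one decimal place matches the precision shown on the page and avoids floating-point noise in the subtraction.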

All matched tasks had no score changes, or no tasks could be matched to the previous evaluation.