Dimension Drop · Severity 10/10 · 2026-W12

Grok 3 Stability dropped 22.5 points

Grok 3 Run #37

Score Comparison

Dimension                  Previous  Current  Change
Overall (v5)                   42.5     56.2   +13.7
Code Execution (v5)            22.5     64.9   +42.4
Knowledge Synthesis (v5)       38.8     44.8    +6.0
Grounding (v5)                 64.5     83.0   +18.5
Value                          13.9     21.1    +7.2
Stability                      54.2     31.7   -22.5
Availability                  100.0    100.0     0.0
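
The Change column above matches Current minus Previous for every row. As a minimal sketch, the snippet below recomputes those differences from the table values and lists the dimensions whose score fell. The names `scores`, `score_changes`, and `dropped` are hypothetical, and treating any negative change as a drop is an assumption; the page does not state how it selects affected dimensions or computes severity.

```python
# Scores copied from the table above as (previous, current) pairs.
scores = {
    "Overall (v5)": (42.5, 56.2),
    "Code Execution (v5)": (22.5, 64.9),
    "Knowledge Synthesis (v5)": (38.8, 44.8),
    "Grounding (v5)": (64.5, 83.0),
    "Value": (13.9, 21.1),
    "Stability": (54.2, 31.7),
    "Availability": (100.0, 100.0),
}


def score_changes(table):
    """Return {dimension: current - previous}, rounded to one decimal place."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


changes = score_changes(scores)

# Assumption: a dimension counts as "dropped" when its change is negative.
dropped = [name for name, delta in changes.items() if delta < 0]

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}")
print("Dropped:", dropped)
```

With the values in this report, only Stability has a negative change, which agrees with the Affected Dimensions list.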

Affected Dimensions

Stability
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT