
Grok 3

grok
Run #87 · Formula v7 · Judge v6 · Benchmark v6

Overall #1, Grounding leader, Communication top tier

Overall Score: 67.7
Current Rank: #6 / 11
Last Evaluated: 04-27 04:18 SGT
Recommended Core Overall 86.88
Normal Updated 04-04 03:30

Core Dimensions (v6)

Code Execution: 88.9
Grounding: 84.4
Engineering Judgment: 43.5
Task Communication: 40
Integrity Rating: 77.5
Integrity: PASS (Integrity Score 77.50)

Legacy Dimensions (v5)

Code Execution: 95.5
Knowledge: 51.5
Long Context: 91.0
Operational Metrics
Value: 25.8
Stability: 35.5
Availability: 99.0

Recent Changes

communication_raw +10 (Grok 3: Task Communication +10)

Score Trend

[Chart: score trend on a 0 to 100 scale across evaluation runs from 03-21 to 04-27; the latest point is a v6 run]

v6 scores are from the latest evaluation run
