Claude Sonnet 4.6

claude
Run #87 · Formula v7 · Judge v6 · Benchmark v6

Communication top tier

Overall Score: 66.2
Current Rank: #7 / 11
Last Evaluated: 04-27 04:18 SGT
Recommended Core Overall 84.07
Normal Updated 04-04 03:30

Core Dimensions (v6)

Integrity: PASS (Integrity Score 74.20)
Code Execution: 86.5
Grounding: 81.1
Engineering Judgment: 43.8
Task Communication: 40
Integrity Rating: 74.2

Legacy Dimensions (v5)

Code Execution: 92.8
Knowledge: 50.8
Long Context: 87.1

Operational Metrics
Value: 25.1
Stability: 35.7
Availability: 99.0

Recent Changes

communication_raw +10 (Claude Sonnet 4.6: Task Communication +10)

Score Trend

[Chart: score trend on a 0 to 100 scale; evaluation dates 03-17 through 04-27; version markers v3, v4, v5, v6]

v6 scores are from the latest evaluation run
