Claude Opus 4.6

claude
Run #87 · Formula v7 · Judge v6 · Benchmark v6

Communication top tier, High availability

62.8
Overall Score
#9 / 11
Current Rank
04-27 04:18 SGT
Last Evaluated
Recommended Core Overall 83.44
Normal Updated 04-04 03:30

Core Dimensions (v6)

PASS
Integrity
Integrity Score 67.50
Code Execution
86.5
Grounding
79.7
Engineering Judgment
46.3
Task Communication
40
Integrity Rating
67.5
Legacy Dimensions (v5)

Code Execution
92.8
Knowledge
50.1
Long Context
85.4
Operational Metrics
Value
5.1
Stability
35.2
Availability
100.0

Recent Changes

grounding_raw +13.3 Claude Opus 4.6: Material constraints +13.3

Score Trend

[Chart: score (0 to 100) by evaluation date, 03-17 through 04-27, across benchmark versions v3, v4, v5, and v6]

v6 scores are from the latest evaluation run
