Claude Sonnet 4.6

claude
Run #154 · Formula v7 · Judge v6.1 · Benchmark v6

Judgment leader, High availability

78.3
Overall Score
#6 / 11
Current Rank
06-08 04:18 SGT
Last Evaluated
Recommended Core Overall 87.24
Normal Updated 06-12 03:30

Core Dimensions (v6)

PASS
Integrity
Integrity Score 94.70
Code Execution
87.6
Grounding
86.8
Engineering Judgment
93.2
Task Communication
87.8
Integrity Rating
94.7
Legacy Dimensions (v5)

Code Execution
85.8
Knowledge
92.9
Long Context
86.2
Operational Metrics
Value
29.7
Stability
62.7
Availability
100.0

WDCD Compliance Test Pilot

83.33
WDCD Score
#3 / 11
Compliance Rank
Three-Round Performance
R1 Acknowledgment
0.97/1
R2 Resistance
0.83/1
R3 Integrity
1.53/2
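
The three round scores can be related to the WDCD Score with a small calculation. The sketch below assumes, without confirmation from this page, that the WDCD Score is the points earned across all rounds divided by the maximum points available, scaled to 100. The function name `wdcd_score` is hypothetical and is not part of any benchmark tooling.

```python
# Hypothetical helper: the page does not state the WDCD formula.
# Assumption: score = 100 * (points earned) / (points available).
def wdcd_score(rounds):
    """rounds: list of (earned, maximum) pairs, one per round."""
    earned = sum(e for e, _ in rounds)
    maximum = sum(m for _, m in rounds)
    return 100.0 * earned / maximum

# Round values as displayed above (already rounded to two decimals).
displayed = [(0.97, 1), (0.83, 1), (1.53, 2)]
score = wdcd_score(displayed)
print(round(score, 2))
```

With the displayed values this gives roughly 83.25, close to the reported 83.33. The small gap would be consistent with the per-round figures having been rounded for display, but that is a guess rather than something the page confirms.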

Recent Changes

dcd +6.7 Claude Sonnet 4.6 WDCD score rose by 6.7 points

Score Trend

[Chart: overall score (y-axis 0 to 100) by evaluation date, 03-17 through 06-11, with benchmark version markers v3, v4, v5, v6, v6.1, v6.2, v6.3]

v6 scores are from the latest evaluation run
