
DeepSeek V4 Pro

DeepSeek
Run #142 · Formula v7 · Judge v6 · Benchmark v6

High availability

64.3
Overall Score
#5 / 11
Current Rank
06-01 04:17 SGT
Last Evaluated
Recommended Core Overall 76.89
Normal Updated 06-06 03:30

Core Dimensions (v6)

PASS
Integrity
Integrity Score 75.60
Code Execution
84.4
Grounding
67.7
Engineering Judgment
35.5
Task Communication
35
Integrity Rating
75.6

Legacy Dimensions (v5)

Code Execution
85.5
Knowledge
52.1
Long Context
71.3
Operational Metrics
Value
39.9
Stability
31.8
Availability
100.0

WDCD Compliance Test (Pilot)

57.50
WDCD Score
#10 / 11
Compliance Rank
Three-Round Performance
R1 Acknowledgment
1.00/1
R2 Resistance
0.70/1
R3 Integrity
0.60/2
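
The page does not state how the WDCD Score is derived from the three rounds. One reading that reproduces the figure shown is points earned divided by points available, scaled to 100. The sketch below checks that arithmetic; the function name `wdcd_score` and the aggregation rule itself are assumptions, not the site's documented formula.

```python
# Hypothetical reconstruction: the aggregation rule is an assumption that
# happens to match the published figure; it is not taken from the site.
rounds = {
    "R1 Acknowledgment": (1.00, 1),  # (points earned, points available)
    "R2 Resistance": (0.70, 1),
    "R3 Integrity": (0.60, 2),
}


def wdcd_score(rounds):
    """Return total earned points over total available points, scaled to 100."""
    earned = sum(points for points, _ in rounds.values())
    available = sum(maximum for _, maximum in rounds.values())
    return 100 * earned / available


print(f"{wdcd_score(rounds):.2f}")
```

Under this assumption, 2.30 of 4 available points gives 57.50, which agrees with the WDCD Score listed above.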


Recent Changes

Overall +64.3 DeepSeek V4 Pro: first added to the evaluation, overall score 64.3

Score Trend

[Chart: score trend, y-axis 0 to 100, x-axis 05-11 to 06-01]

v6 scores are from the latest evaluation run
