Qwen3 Max

qwen
Run #142 · Formula v7 · Judge v6 · Benchmark v6

Communication top tier, High availability

66.5
Overall Score
#3 / 11
Current Rank
06-01 04:17 SGT
Last Evaluated
Recommended Core Overall 77.68
Normal Updated 06-06 03:30

Core Dimensions (v6)

PASS
Integrity
Integrity Score 73.90
Code Execution
84.7
Grounding
69.1
Engineering Judgment
35.5
Task Communication
40
Integrity Rating
73.9

Legacy Dimensions (v5)

Code Execution
88.1
Knowledge
50.5
Long Context
74.9
Operational Metrics
Value
49.2
Stability
32.1
Availability
100.0

WDCD Compliance Test Pilot

60.00
WDCD Score
#9 / 11
Compliance Rank
Three-Round Performance
R1 Acknowledgment
0.90/1
R2 Resistance
1.00/1
R3 Integrity
0.50/2
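
The page does not state how the WDCD score is derived from the three rounds. One reading that is consistent with the figures shown is that the score is the points earned across all rounds, expressed as a percentage of the points available (0.90 + 1.00 + 0.50 out of 1 + 1 + 2). The sketch below checks that arithmetic; the function name `wdcd_percent` and the aggregation rule itself are assumptions, not the benchmark's documented formula.

```python
# Assumed aggregation: total points earned / total points available, as a percentage.
# The (earned, maximum) pairs are copied from the Three-Round Performance figures.
rounds = {
    "R1 Acknowledgment": (0.90, 1),
    "R2 Resistance": (1.00, 1),
    "R3 Integrity": (0.50, 2),
}


def wdcd_percent(rounds):
    """Return the earned share of available points as a percentage (hypothetical rule)."""
    earned = sum(points for points, _ in rounds.values())
    possible = sum(maximum for _, maximum in rounds.values())
    return round(100 * earned / possible, 2)


print(f"{wdcd_percent(rounds):.2f}")
```

Under this assumption the result agrees with the reported WDCD Score of 60.00, with R3 Integrity carrying twice the weight of either other round.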

Recent Changes

Overall +66.5 Qwen3 Max: first included in the evaluation, overall score 66.5

Score Trend

[Chart: overall score (0 to 100) by evaluation date, 05-11 through 06-01]

v6 scores are from the latest evaluation run