Qwen Max

qwen
Run #87 · Formula v7 · Judge v6 · Benchmark v6

Communication top tier, High availability

Overall Score: 65.4
Current Rank: #8 / 11
Last Evaluated: 04-27 04:18 SGT
Recommended Core Overall 77.91
Major Anomaly Updated 04-04 03:30

Core Dimensions (v6)

Integrity: PASS
Integrity Score: 65.80
Code Execution: 78.4
Grounding: 77.3
Engineering Judgment: 40.7
Task Communication: 40
Integrity Rating: 65.8

Legacy Dimensions (v5)

Code Execution: 80.4
Knowledge: 49.4
Long Context: 82.5

Operational Metrics

Value: 48.6
Stability: 32.7
Availability: 100.0

Recent Changes

communication_raw +15 Qwen Max: Task Communication +15

Score Trend

[Chart: score (0 to 100) by evaluation date, 03-17 through 04-27, across versions v3 to v6]

v6 scores are from the latest evaluation run
