
ERNIE Bot 4.0 (文心一言 4.0)

ernie
Run #87 · Formula v7 · Judge v6 · Benchmark v6

Communication top tier, High availability

Overall Score: 72.0
Current Rank: #3 / 11
Last Evaluated: 04-27 04:18 SGT
Recommended Core Overall 74.89
Normal Updated 04-04 03:30

Core Dimensions (v6)

Code Execution: 77
Grounding: 72.3
Engineering Judgment: 39.7
Task Communication: 40
Integrity Rating: 69.2
Integrity: PASS (Integrity Score 69.20)

Legacy Dimensions (v5)

Code Execution: 82.1
Knowledge: 46.1
Long Context: 80.4

Operational Metrics

Value: 98.6
Stability: 31.3
Availability: 100.0

Recent Changes

communication_raw +15 (文心一言 4.0: Task Communication +15)

Score Trend

[Score trend chart: y-axis 0 to 100; evaluation dates from 03-21 to 04-27; v6 series]

v6 scores are from the latest evaluation run
