ERNIE Bot 4.5 (文心一言 4.5)
Change Analysis · 2026 Week 27
ERNIE Bot 4.5: in 2026 Week 27, the Code Execution (v5) dimension dropped 23.9 points.
Score Comparison
Previous 76.5 · Current 69.6 · Change -6.9
| Dimension | Previous | Current | Change |
|---|---|---|---|
| Code Execution (v5) | 69.7 | 45.8 | -23.9 |
| Knowledge Synthesis (v5) | 66.7 | 66.6 | -0.1 |
| Grounding (v5) | 93.4 | 93.7 | +0.3 |
| Value | 99.2 | 98.7 | -0.5 |
| Stability | 35 | 26.7 | -8.3 |
| Availability | 100 | 100 | 0 |
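
The Change column can be reproduced as Current minus Previous. The following is a minimal sketch of that check, not the leaderboard's own code: the names `rows`, `delta`, and `largest_drop` are hypothetical, and rounding to one decimal place is an assumption based on how the scores are displayed.

```python
# Scores copied from the table above: (previous, current, reported change).
rows = {
    "Code Execution (v5)": (69.7, 45.8, -23.9),
    "Knowledge Synthesis (v5)": (66.7, 66.6, -0.1),
    "Grounding (v5)": (93.4, 93.7, 0.3),
    "Value": (99.2, 98.7, -0.5),
    "Stability": (35.0, 26.7, -8.3),
    "Availability": (100.0, 100.0, 0.0),
}


def delta(previous, current):
    """Change in points, rounded to one decimal (assumed display precision)."""
    return round(current - previous, 1)


def largest_drop(table):
    """Name of the dimension with the most negative change."""
    return min(table, key=lambda name: delta(table[name][0], table[name][1]))


# Recompute every change and compare it with the reported value.
mismatches = [
    name
    for name, (previous, current, reported) in rows.items()
    if delta(previous, current) != reported
]

print("mismatches:", mismatches)
print("largest drop:", largest_drop(rows))
```

The same `delta` helper applies to the Score Comparison figures (76.5 and 69.6). How the per-dimension scores combine into that figure is not stated on the page, so the sketch does not attempt to reproduce it.
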
No task-level changes are listed: either every matched task kept the same score, or no tasks could be matched to the previous evaluation.