Dimension Drop
Severity 10/10
2026-W12
GPT-o3 Availability dropped 31 points
Score Comparison
| Dimension | Previous | Current | Change |
|---|---|---|---|
| Overall (v5) | 39.0 | 34.5 | -4.5 |
| Code Execution (v5) | 20.2 | 43.4 | +23.2 |
| Knowledge Synthesis (v5) | 34.4 | 35.8 | +1.4 |
| Grounding (v5) | 62.3 | 28.8 | -33.5 |
| Value | 4.7 | 4.3 | -0.4 |
| Stability | 53.0 | 28.0 | -25.0 |
| Availability | 100.0 | 69.0 | -31.0 |
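The Change column appears to be Current minus Previous. As a minimal sketch under that assumption, the snippet below recomputes it from the scores in the table; the names `scores` and `compute_changes` are illustrative and not part of the benchmark's tooling.

```python
# Scores copied from the table above as (previous, current) pairs.
scores = {
    "Overall (v5)": (39.0, 34.5),
    "Code Execution (v5)": (20.2, 43.4),
    "Knowledge Synthesis (v5)": (34.4, 35.8),
    "Grounding (v5)": (62.3, 28.8),
    "Value": (4.7, 4.3),
    "Stability": (53.0, 28.0),
    "Availability": (100.0, 69.0),
}


def compute_changes(table):
    """Return current minus previous for each dimension, rounded to one decimal."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


changes = compute_changes(scores)
for name, delta in changes.items():
    # The explicit sign mirrors the +/- formatting of the Change column.
    print(f"{name}: {delta:+.1f}")
```

Rounding to one decimal matches the precision of the table and avoids floating-point noise in the differences.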
Affected Dimensions
Availability
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT