Dimension Drop · Severity 10/10 · 2026-W12

GPT-4o Availability dropped 35 points

GPT-4o Run #37

Score Comparison

Dimension                  Previous  Current  Change
Overall (v5)                   41.2     39.2    -2.0
Code Execution (v5)            19.6     48.8   +29.2
Knowledge Synthesis (v5)       35.4     33.4    -2.0
Grounding (v5)                 62.3     40.4   -21.9
Value                          18.6     19.4    +0.8
Stability                      52.8     32.2   -20.6
Availability                  100.0     65.0   -35.0
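
The Change column appears to be Current minus Previous. A minimal sketch that recomputes it from the scores listed above and picks out the largest drop; the names `scores` and `changes` are hypothetical and not part of the dashboard:

```python
# Scores copied from the comparison table as (previous, current).
# The dictionary and function names are illustrative assumptions.
scores = {
    "Overall (v5)": (41.2, 39.2),
    "Code Execution (v5)": (19.6, 48.8),
    "Knowledge Synthesis (v5)": (35.4, 33.4),
    "Grounding (v5)": (62.3, 40.4),
    "Value": (18.6, 19.4),
    "Stability": (52.8, 32.2),
    "Availability": (100.0, 65.0),
}


def changes(table):
    """Return {dimension: current - previous}, rounded to one decimal."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


deltas = changes(scores)

# The dimension with the most negative change is the largest drop.
worst = min(deltas, key=deltas.get)
print(worst, deltas[worst])  # Availability -35.0
```

Rounding to one decimal matches the precision of the table and avoids floating-point noise in the differences.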

Affected Dimensions

Availability
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT