Dimension Drop
Severity 10/10
2026-W12
GPT-o3 Grounding (v5) dropped 33.5 points
Score Comparison
| Dimension | Previous | Current | Change |
|---|---|---|---|
| Overall (v5) | 39.0 | 34.5 | -4.5 |
| Code Execution (v5) | 20.2 | 43.4 | +23.2 |
| Knowledge Synthesis (v5) | 34.4 | 35.8 | +1.4 |
| Grounding (v5) | 62.3 | 28.8 | -33.5 |
| Value | 4.7 | 4.3 | -0.4 |
| Stability | 53.0 | 28.0 | -25.0 |
| Availability | 100.0 | 69.0 | -31.0 |
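
The Change column appears to be Current minus Previous. A minimal Python sketch that rechecks this arithmetic against the rows above follows; the names `ROWS`, `score_change`, and `largest_drop` are hypothetical helpers introduced for illustration and are not part of the benchmark tooling.

```python
# Rows copied from the Score Comparison table:
# (dimension, previous, current, reported change).
ROWS = [
    ("Overall (v5)", 39.0, 34.5, -4.5),
    ("Code Execution (v5)", 20.2, 43.4, 23.2),
    ("Knowledge Synthesis (v5)", 34.4, 35.8, 1.4),
    ("Grounding (v5)", 62.3, 28.8, -33.5),
    ("Value", 4.7, 4.3, -0.4),
    ("Stability", 53.0, 28.0, -25.0),
    ("Availability", 100.0, 69.0, -31.0),
]


def score_change(previous, current):
    """Return current minus previous, rounded to one decimal like the table."""
    return round(current - previous, 1)


def largest_drop(rows):
    """Return the name of the dimension with the most negative change."""
    return min(rows, key=lambda row: score_change(row[1], row[2]))[0]


# Every reported change should match the recomputed one.
for name, previous, current, reported in ROWS:
    assert score_change(previous, current) == reported, name
```

Under this reading, `largest_drop(ROWS)` picks out Grounding (v5), which agrees with the dimension the alert lists as affected.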
Affected Dimensions
Grounding (v5)
Top 5 Lost Tasks
#1 Root Cause Analysis and Evidence Boundaries

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Grounding (v5) | 66.7 | 0.0 | -66.7 |

Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29516, Requested 800. Please try again in 632ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#2 Breaking Changes List (Strict)

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Grounding (v5) | 66.7 | 0.0 | -66.7 |

Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29529, Requested 675. Please try again in 408ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#3 Customer Migration Risk Assessment

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Grounding (v5) | 66.7 | 0.0 | -66.7 |

Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29393, Requested 677. Please try again in 140ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#4 Cost Variation Calculation (Strict)

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Grounding (v5) | 66.7 | 0.0 | -66.7 |

Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29868, Requested 695. Please try again in 1.126s. Visit https://platform.openai.com/account/rate-limits to learn more.
#5 Sustainability of High-Quality Growth

| Dimension | Previous | Current | Change |
|---|---|---|---|
| Grounding (v5) | 66.7 | 0.0 | -66.7 |

Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 30000, Requested 561. Please try again in 1.122s. Visit https://platform.openai.com/account/rate-limits to learn more.
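
All five excerpts share the same rate-limit message shape, so the quota figures can be pulled out mechanically. The sketch below is a minimal Python example of that parsing. It assumes the message wording seen in the excerpts above, which is not an official format specification, and `parse_rate_limit` is a hypothetical helper name.

```python
import re

# Regex written against the wording in the excerpts above; this is an
# assumption about the message shape, not a documented format.
_RATE_LIMIT = re.compile(
    r"Limit (?P<limit>\d+), Used (?P<used>\d+), Requested (?P<requested>\d+)\. "
    r"Please try again in (?P<delay>[\d.]+)(?P<unit>ms|s)\."
)


def parse_rate_limit(message):
    """Extract quota numbers and the suggested wait (in seconds), or None."""
    match = _RATE_LIMIT.search(message)
    if match is None:
        return None
    delay = float(match.group("delay"))
    if match.group("unit") == "ms":
        delay /= 1000.0  # normalise milliseconds to seconds
    limit = int(match.group("limit"))
    used = int(match.group("used"))
    requested = int(match.group("requested"))
    return {
        "limit": limit,
        "used": used,
        "requested": requested,
        "retry_after_s": delay,
        # True when the request would have pushed usage past the limit.
        "over_budget": used + requested > limit,
    }


# Fragment trimmed from the task #1 excerpt.
sample = "Limit 30000, Used 29516, Requested 800. Please try again in 632ms."
info = parse_rate_limit(sample)
```

For the task #1 fragment, `used + requested` is 30316 against a limit of 30000, so `over_budget` is true. The same holds for the figures in the other four excerpts.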
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT