Dimension Drop · Severity 10/10 · 2026-W12

GPT-o3 Grounding (v5) dropped 33.5 points

GPT-o3 Run #37

Score Comparison

Dimension                  Previous  Current  Change
Overall (v5)                   39.0     34.5    -4.5
Code Execution (v5)            20.2     43.4   +23.2
Knowledge Synthesis (v5)       34.4     35.8    +1.4
Grounding (v5)                 62.3     28.8   -33.5
Value                           4.7      4.3    -0.4
Stability                      53.0     28.0   -25.0
Availability                  100.0     69.0   -31.0
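The Change column is the Current score minus the Previous score. A minimal Python sketch, assuming the rows are transcribed by hand from the table above (the names `ROWS`, `changes`, and `largest_drop` are illustrative and not part of the dashboard), recomputes the column and picks out the largest drop:

```python
# Rows copied from the Score Comparison table: (dimension, previous, current).
ROWS = [
    ("Overall (v5)", 39.0, 34.5),
    ("Code Execution (v5)", 20.2, 43.4),
    ("Knowledge Synthesis (v5)", 34.4, 35.8),
    ("Grounding (v5)", 62.3, 28.8),
    ("Value", 4.7, 4.3),
    ("Stability", 53.0, 28.0),
    ("Availability", 100.0, 69.0),
]


def changes(rows):
    """Return {dimension: current - previous}, rounded to one decimal place."""
    return {name: round(cur - prev, 1) for name, prev, cur in rows}


def largest_drop(rows):
    """Return the (dimension, change) pair with the most negative change."""
    deltas = changes(rows)
    name = min(deltas, key=deltas.get)
    return name, deltas[name]


if __name__ == "__main__":
    for name, delta in changes(ROWS).items():
        print(f"{name}: {delta:+.1f}")
    print("Largest drop:", largest_drop(ROWS))
```

Rounding to one decimal place matches the precision of the table and avoids floating-point noise in the differences.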

Affected Dimensions

Grounding (v5)

Top Lost Tasks (5)

#1 Root Cause Analysis and Evidence Boundaries · Grounding (v5) · 66.7 → 0 (-66.7)
Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29516, Requested 800. Please try again in 632ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#2 Breaking Changes List · Grounding (v5) · 66.7 → 0 (-66.7) · Strict
Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29529, Requested 675. Please try again in 408ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#3 Customer Migration Risk Assessment · Grounding (v5) · 66.7 → 0 (-66.7)
Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29393, Requested 677. Please try again in 140ms. Visit https://platform.openai.com/account/rate-limits to learn more.
#4 Cost Variation Calculation · Grounding (v5) · 66.7 → 0 (-66.7) · Strict
Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 29868, Requested 695. Please try again in 1.126s. Visit https://platform.openai.com/account/rate-limits to learn more.
#5 Sustainability of High-Quality Growth · Grounding (v5) · 66.7 → 0 (-66.7)
Model Raw Response (excerpt)
[API ERROR] Rate limit reached for gpt-4o in organization org-5kL87cAHHWwzzzRXfZoA5jZm on tokens per min (TPM): Limit 30000, Used 30000, Requested 561. Please try again in 1.122s. Visit https://platform.openai.com/account/rate-limits to learn more.
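All five excerpts above are rate-limit errors rather than model answers, and each one carries a suggested wait ("Please try again in 632ms", "in 1.126s"). As a rough sketch only, assuming the raw response is available as a plain string in the format shown, a harness could flag such responses and extract the wait in seconds. The helper names `is_rate_limit_error` and `retry_delay_seconds` are hypothetical, and the sample message is abbreviated from excerpt #1; nothing here is taken from the benchmark's actual code.

```python
import re

# Matches the retry hint seen in the excerpts, e.g. "in 632ms" or "in 1.126s".
_RETRY_HINT = re.compile(r"Please try again in (\d+(?:\.\d+)?)(ms|s)\b")


def is_rate_limit_error(response: str) -> bool:
    """True if the raw response looks like the rate-limit errors shown above."""
    return response.startswith("[API ERROR]") and "Rate limit reached" in response


def retry_delay_seconds(response: str):
    """Return the suggested wait in seconds, or None if no hint is present."""
    match = _RETRY_HINT.search(response)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value


if __name__ == "__main__":
    # Abbreviated sample in the same shape as excerpt #1 (assumption: same format).
    sample = (
        "[API ERROR] Rate limit reached for gpt-4o on tokens per min (TPM): "
        "Limit 30000, Used 29516, Requested 800. Please try again in 632ms."
    )
    print(is_rate_limit_error(sample), retry_delay_seconds(sample))
```

Separating detection from delay parsing lets a caller choose whether to retry after the hinted delay or simply mark the task as an infrastructure failure instead of scoring it as a model response.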
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT