Dimension Drop · Severity 10/10 · 2026-W12

DeepSeek R1 Stability dropped 22.1 points

DeepSeek R1 Run #37

Score Comparison

Dimension                 Previous  Current  Change
Overall (v5)                  49.0     65.8   +16.8
Code Execution (v5)           20.5     67.9   +47.4
Knowledge Synthesis (v5)      36.4     42.9    +6.5
Grounding (v5)                60.2     78.3   +18.1
Value                         69.4     88.1   +18.7
Stability                     53.7     31.6   -22.1
Availability                 100.0    100.0    +0.0
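
The Change column is simply Current minus Previous, and Stability is the only dimension with a negative change. A minimal sketch for re-deriving this from the table follows. The names score_changes and dropped_dimensions are hypothetical, and the "any negative change" rule is an assumption for illustration; the report does not state how a drop is detected or how the severity rating is computed.

```python
# Scores copied from the Score Comparison table: (previous, current).
scores = {
    "Overall (v5)": (49.0, 65.8),
    "Code Execution (v5)": (20.5, 67.9),
    "Knowledge Synthesis (v5)": (36.4, 42.9),
    "Grounding (v5)": (60.2, 78.3),
    "Value": (69.4, 88.1),
    "Stability": (53.7, 31.6),
    "Availability": (100.0, 100.0),
}


def score_changes(table):
    """Return current minus previous per dimension, rounded to one decimal."""
    return {name: round(cur - prev, 1) for name, (prev, cur) in table.items()}


def dropped_dimensions(table):
    """Return dimensions whose score fell (assumed rule: change below zero)."""
    return [name for name, delta in score_changes(table).items() if delta < 0]


changes = score_changes(scores)
dropped = dropped_dimensions(scores)

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}")
print("Dropped:", ", ".join(dropped))
```

Rounding to one decimal keeps floating-point noise out of the comparison with the published Change values.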

Affected Dimensions

Stability
Run #37 · Formula v5 · Judge v6 · Benchmark v5.1 · 2026-03-22 14:26 SGT