Claude Opus 4.6
Change Analysis · 2026 Week12
Claude Opus 4.6, 2026 Week12: the Code Execution (v5) dimension rose 42 pts
Score Comparison
Previous: 40.3 · Current: 51.3 · Change: +11
| Dimension | Previous | Current | Change |
|---|---|---|---|
| Code Execution (v5) | 20.2 | 62.2 | +42 |
| Knowledge Synthesis (v5) | 37.8 | 43.3 | +5.5 |
| Grounding (v5) | 66.7 | 74.6 | +7.9 |
| Value | 2.8 | 4 | +1.2 |
| Stability | 53.5 | 31 | -22.5 |
| Availability | 100 | 100 | 0 |
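
The Change column appears to be the current score minus the previous score. The following is a minimal sketch that recomputes it from the table values, assuming that simple subtraction rounded to one decimal place is how the column is derived. The names `scores`, `change`, and `largest_mover` are hypothetical, and the page does not say how the headline score is aggregated from the dimensions, so that is not modeled here.

```python
# (previous, current) scores copied from the table above.
scores = {
    "Code Execution (v5)": (20.2, 62.2),
    "Knowledge Synthesis (v5)": (37.8, 43.3),
    "Grounding (v5)": (66.7, 74.6),
    "Value": (2.8, 4.0),
    "Stability": (53.5, 31.0),
    "Availability": (100.0, 100.0),
}


def change(previous: float, current: float) -> float:
    """Assumed definition: current minus previous, rounded to one decimal."""
    return round(current - previous, 1)


# Recompute the Change column for every dimension.
changes = {name: change(prev, cur) for name, (prev, cur) in scores.items()}

# The dimension with the largest absolute movement, up or down.
largest_mover = max(changes, key=lambda name: abs(changes[name]))

for name, delta in changes.items():
    print(f"{name}: {delta:+g}")
print("Largest mover:", largest_mover)
```

Ranking by absolute value keeps a large drop, such as the one in Stability, from being hidden behind smaller gains.
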
No per-task changes to report: either every matched task scored the same as in the previous evaluation, or no tasks could be matched to it.