9 Models Tie at 77.5 on Main Leaderboard: Perfect Score on Code Execution, but Only 50 on Material Constraint
The Smoke Lite evaluation results from June 5, 2026 show that 9 of 11 models tied at 77.5 on the main leaderboard, an unusual outcome. The nine share the same profile: each scored a perfect 100 on the Code Execution dimension but only 50 on the Material Constraint dimension.
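A tie like this can be surfaced by grouping models by their main score. The following is a minimal sketch, not the evaluation's own tooling: the function name `find_ties`, the model names, and the two non-tied scores are all hypothetical placeholders, since the source gives only the count of models and the tied score of 77.5.

```python
from collections import defaultdict


def find_ties(scores):
    """Group model names by main score and keep only groups with more than one model."""
    groups = defaultdict(list)
    for name, score in scores.items():
        groups[score].append(name)
    return {score: names for score, names in groups.items() if len(names) > 1}


# Nine placeholder models at the reported tied score of 77.5.
scores = {f"model_{i}": 77.5 for i in range(1, 10)}
# The remaining two scores are NOT from the source; they are invented placeholders.
scores["model_10"] = 60.0
scores["model_11"] = 85.0

ties = find_ties(scores)
for score, names in ties.items():
    print(f"{len(names)} models tied at {score}")
```

Grouping on exact score equality is enough here because the tied models report the same value; scores computed with floating-point rounding differences would need a tolerance instead.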