Google Updates (1 article)

Gemini 3.1 Pro Integrity Turnaround! Main Leaderboard Score Soars Nearly 15 Points: Is Google AI Staging a Strong Rebound?

Yesterday, Gemini 3.1 Pro drew scrutiny after receiving an integrity rating of "fail." Today it rebounded strongly: the integrity rating flipped from fail to pass, and its main leaderboard score climbed from 74.00 to 88.98, a gain of 14.98 points. This article analyzes the Smoke evaluation data and examines whether the change reflects random fluctuation or real progress.