Behind GPT-o3's 8.7-Point Surge: Weekly Testing of 11 AI Models Reveals 3 Dangerous Signals
Weekly testing of 11 top AI models reveals concerning trends: stability has become a luxury, long-context capabilities are collectively declining, and Chinese models are reshaping the competitive landscape.