GPT-4o Crashes: The Strict Mode Trap Behind a 35-Point Plunge
GPT-4o's usability score collapsed from 100 to 65, a 35-point drop caused by overly conservative "strict tool calling" that leads the model to refuse even basic tasks.