Gemini 2.5 Pro - AI News

Gemini 2.5 Pro Crashes: Engineering Judgment Failure Behind 23-Point Stability Plunge

Gemini 2.5 Pro's stability score plummeted 22.8 points in one week, exposing a critical lack of engineering judgment despite gains in programming capabilities.

Gemini 2.5 Pro's Judgment Hits Zero: Choosing to Report P0 Security Incident Instead of Taking Action

Gemini 2.5 Pro scored 0 on engineering judgment when faced with a critical data breach scenario, exposing a fundamental flaw in AI decision-making during emergencies.

Gemini 2.5 Pro Scores 0 from 100 on Time Zone Reasoning: How Terrifying Are LLMs' Common Sense Blind Spots

A simple time zone question that elementary school students can answer correctly caused Google's most powerful model Gemini 2.5 Pro to fail completely, exposing systematic deficiencies in LLMs' handling of real-world basic common sense.

Gemini 2.5 Pro (3 articles)

Gemini 2.5 Pro Crashes: Engineering Judgment Failure Behind 23-Point Stability Plunge

Gemini 2.5 Pro's Judgment Hits Zero: Choosing to Report P0 Security Incident Instead of Taking Action

Gemini 2.5 Pro Scores 0 from 100 on Time Zone Reasoning: How Terrifying Are LLMs' Common Sense Blind Spots