Claude 3.5 Sonnet Tops SWE-bench Coding Benchmark: 72.7% Score Leads AI Programming Track
Anthropic's Claude 3.5 Sonnet scored a reported 72.7% on the SWE-bench software engineering benchmark, which would make it the first AI model to exceed 70%. The result places it ahead of competitors such as GPT-4o and Gemini 1.5 Pro in AI-assisted programming.