Coding AI - AI News | 赢政天下

Claude 3.5 Sonnet Reaches 49% on SWE-bench as AI Programming Capability Advances

Anthropic's Claude 3.5 Sonnet scores 49% on the SWE-bench software engineering benchmark, a milestone for AI coding capabilities. The result has prompted widespread discussion in the developer community and a surge in practical project implementations.

Claude 3.5 Sonnet's Coding Capabilities Lead SWE-bench Rankings: 49% Score Surpasses GPT-4o's 33%

Anthropic's updated Claude 3.5 Sonnet model resolves 49% of tasks on the SWE-bench software engineering benchmark, well ahead of OpenAI's GPT-4o (33%) and other competitors. The result sets a new performance record for coding AI and has drawn widespread discussion and praise from developers worldwide.

Claude 3.5 Sonnet Tops SWE-bench: 49% Accuracy Surpasses GPT-4o, Developer Productivity Enters New Revolution

Anthropic's Claude 3.5 Sonnet has reached 49% accuracy on the SWE-bench coding benchmark, well above GPT-4o's previous best. The result has drawn strong interest from developers worldwide, with over 50,000 related discussions on X in the past 24 hours.
