Claude 3.5 Sonnet (7 articles)

Claude 3.5 Sonnet Leads GPT-4o in Programming Benchmark: 49% Accuracy Ignites Developer Community

Anthropic's Claude 3.5 Sonnet achieves 49% accuracy on the SWE-bench software engineering benchmark, surpassing OpenAI's GPT-4o (33.2%) for the first time in real programming tasks. This breakthrough has sparked tens of thousands of reposts on X platform and heated discussions in the programmer community, with developers sharing real-world cases demonstrating its human-like debugging capabilities.

Claude 3.5 Sonnet Anthropic SWE-bench
530

Claude 3.5 Sonnet's Coding Capabilities Lead SWE-bench Rankings: 49% Score Surpasses GPT-4o's 33%

Anthropic's updated Claude 3.5 Sonnet model achieves a breakthrough 49% task resolution rate on the authoritative SWE-bench software engineering benchmark, significantly outperforming OpenAI's GPT-4o (33%) and other competitors. This achievement not only sets a new performance record for coding AI but has also sparked widespread discussion and praise within the global developer community.

Claude 3.5 Sonnet SWE-bench 编码AI
478

Claude 3.5 Sonnet Leads SWE-bench Benchmark, Code Generation Capability Surpasses GPT-4o

Anthropic's Claude 3.5 Sonnet has achieved remarkable performance in the authoritative SWE-bench code benchmark test, successfully surpassing OpenAI's GPT-4o and demonstrating exceptional software engineering capabilities. This breakthrough marks a significant advancement for the Claude series in code generation and provides developers with a more reliable programming assistant.