Anthropic (39 articles)

Claude 3.5 Sonnet Leads GPT-4o in Programming Benchmark: 49% Accuracy Ignites Developer Community

Anthropic's Claude 3.5 Sonnet achieves 49% accuracy on the SWE-bench software engineering benchmark, surpassing OpenAI's GPT-4o (33.2%) on real programming tasks for the first time. The breakthrough has drawn tens of thousands of reposts on X and heated discussion in the developer community, with developers sharing real-world cases of its human-like debugging capabilities.

Claude 3.5 Sonnet Anthropic SWE-bench
518

Anthropic Claude Cowork Legal Plugin Released: AI Agents Usher in New Era of Legal Work Automation

In February 2026, Anthropic officially released the Claude Cowork Legal Plugin, a desktop-level intelligent assistant module that quickly caught fire in the legal tech sphere. More than a simple contract-summarization tool, the plugin is an agent that can directly access internal enterprise systems such as Slack, Box, and Microsoft 365 to automate complex legal workflows.

Anthropic Claude Cowork Legal AI
819

Claude 3.5 Sonnet Breaks 90% in Coding Tests: AI Programming Ability Approaches Human Level

Anthropic's Claude 3.5 Sonnet model scored 92.0% on the HumanEval coding benchmark, surpassing all previous AI models and marking a new milestone in AI coding capability. The result sparked heated discussion on X, with over 150,000 interactions as developers shared real projects built with Claude and debated the future role of AI programmers.

Claude 3.5 Anthropic HumanEval
467

Claude 3.5 Sonnet's Coding Capabilities Lead SWE-bench Rankings: 49% Score Surpasses GPT-4o's 33%

Anthropic's updated Claude 3.5 Sonnet model achieves a 49% task resolution rate on the SWE-bench software engineering benchmark, significantly outperforming OpenAI's GPT-4o (33%) and other competitors. The achievement not only sets a new performance record for coding AI but has also drawn widespread discussion and praise from the global developer community.

Claude 3.5 Sonnet SWE-bench Coding AI
470

Claude 3.5 Sonnet Leads SWE-bench Benchmark, Code Generation Capability Surpasses GPT-4o

Anthropic's Claude 3.5 Sonnet has delivered standout results on the SWE-bench software engineering benchmark, surpassing OpenAI's GPT-4o and demonstrating strong software engineering capability. The result marks a significant advance for the Claude series in code generation and gives developers a more reliable programming assistant.

Claude 3.5 Sonnet Code Generation Anthropic
448