Claude 3.5 Sonnet Leads SWE-bench Benchmark, Code Generation Capability Surpasses GPT-4o

Anthropic's Claude 3.5 Sonnet has achieved remarkable performance in the authoritative SWE-bench code benchmark test, successfully surpassing OpenAI's GPT-4o and demonstrating exceptional software engineering capabilities. This breakthrough marks a significant advancement for the Claude series in code generation and provides developers with a more reliable programming assistant.

Claude 3.5 Sonnet 代码生成 Anthropic
451

DeepSeek-V2 Open Source Release: 236B Parameters Run on Just 16GB VRAM, Math Capabilities Surpass Llama3, Igniting Developer Community

DeepSeek team releases open-source LLM DeepSeek-V2 with 236B parameters requiring only 16GB VRAM for inference, outperforming Meta's Llama3 in mathematical benchmarks. The model has garnered over 150,000 reposts in Chinese communities, marking a major breakthrough for domestic AI in efficient large model development.

DeepSeek-V2 开源大模型 国产AI
370