Translation Quality (1 article)

Five-Model Translation Showdown: Week 19 Quality Evaluation, gpt-5.5 Leads with 8.7 Points

This week, 5 models completed 240 translation tasks. In a blind multi-model comparison on 3 sampled articles, gpt-5.5 ranked best overall, with an average score of 8.7/10.