Alibaba's Qwen Accused by Anthropic of Distilling Claude Outputs: Model Distillation Ethics Controversy Escalates

Anthropic has publicly accused Alibaba's Qwen lab of systematically using millions of Claude API queries to extract reasoning trajectories for model distillation, in alleged violation of its terms of service. The accusation has sparked debate on AI training ethics, intellectual property protection, and fair competition between open-source and closed-source models.

A high-profile controversy has recently erupted in the AI field. U.S. AI company Anthropic publicly accused Alibaba's Qwen lab of conducting model distillation by calling the Claude API at massive scale to extract reasoning trajectories for training a competing model. Anthropic says this violates its terms of service, and the accusation has set off heated discussion in the industry about AI training ethics, intellectual property protection, and fair competition between open-source and closed-source models.

According to Anthropic, the Qwen team may have used millions of API queries to systematically collect Claude's intermediate steps and output trajectories on complex reasoning tasks. The data was then allegedly used for distillation training, with the aim of helping the Qwen model quickly catch up with Claude in areas such as mathematics, programming, and logical reasoning. The accusation spread quickly on the social platform X, where related posts drew more than a thousand likes.

Model distillation is a common technique that lets developers train smaller, more efficient models on the outputs of larger models in order to reduce inference costs. However, when the teacher is a model served through a commercial API and its core reasoning capabilities are extracted at scale, the line between acceptable and unacceptable use becomes contested. Anthropic emphasized in its statement that such behavior not only harms its commercial interests but could also undermine the entire industry's trust in API services.
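To make the general idea of distillation concrete, here is a minimal sketch of the textbook soft-label distillation loss, in which a student model is trained to match a teacher's softened output distribution. It is an illustration only and not a description of any method attributed to the parties in this dispute: this variant assumes direct access to the teacher's logits, whereas distillation through an API would more likely train on the teacher's generated text. The function names, the example logits, and the temperature value are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention for keeping gradient magnitudes
    comparable across different temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three classes (or three candidate tokens).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

In this sketch the loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so `loss_far` is larger than `loss_close`. A training loop would minimize this quantity over many examples.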

Alibaba has not yet issued an official response. The Qwen series, which has iterated rapidly in the open-source community, has long been regarded as an important representative of China's AI capabilities. Earlier versions such as Qwen2 approached or surpassed some international closed-source models on multiple benchmarks, which has placed the series' training methods under scrutiny.

This incident has pushed the legality and ethics of AI distillation to the forefront. Supporters argue that open-source models that accelerate their catch-up through legitimate data distillation contribute to the democratization of technology. Critics worry that, without clear rules, API providers' intellectual property will be difficult to protect, which would discourage investment in innovation.

Industry analysts point out that similar controversies may prompt API providers to strengthen monitoring mechanisms, such as limiting query frequency, adding watermarks, or adjusting service terms. At the same time, regulators may also step in to formulate new norms for AI data usage.
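As an illustration of what limiting query frequency can look like in practice, here is a minimal sketch of a token-bucket rate limiter, a widely used general technique. It does not describe any provider's actual system; the class name, parameters, and the injectable clock are assumptions made for the example.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock: a burst of 2 requests, refilled at 1 request per second.
fake_time = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_time[0])
first_burst = [bucket.allow() for _ in range(3)]   # third request is rejected
fake_time[0] = 1.0                                  # one second passes
after_wait = bucket.allow()                         # one token has been refilled
```

A real deployment would typically keep one bucket per API key and could charge a higher `cost` for more expensive requests. A limit of this kind caps request rate only, so it would be one control among several rather than a complete defense against large-scale extraction.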

Regardless of the final outcome, the accusation has prompted AI developers worldwide to re-examine where the ethical boundaries of distillation lie. As open-source models race to catch up with closed-source ones, balancing efficiency and fairness will remain a long-term challenge.