Anthropic's Hidden 93.9% Performance Monster: Claude Mythos Limited to Cyber Defense Sparks AI Openness Controversy

Claude Mythos, with an impressive 93.9% benchmark performance, remains locked away by Anthropic, available only to cyber defense agencies. This decision highlights fundamental challenges facing the AI industry regarding safety and openness.

When an AI model achieves a remarkable score of 93.9% in benchmark tests, it is traditionally a moment to celebrate widely. However, Anthropic has chosen a completely different path—locking Claude Mythos Preview in a vault, accessible only to cyber defense agencies. Behind this decision lies a fundamental dilemma that the AI industry is currently facing.

Extraordinary Scores and Extraordinary Decisions

According to Anthropic's official statement, Claude Mythos Preview scored 93.9% on the SWE-bench test. SWE-bench is a crucial benchmark for assessing AI's code generation and problem-solving capabilities, and this score indicates that the model is approaching the level of human professional programmers.

However, unlike other AI companies eager to showcase technological breakthroughs, Anthropic has chosen to restrict access. Reportedly, the company has explicitly stated that due to potential risks, Claude Mythos is only available to cyber defense-related agencies and will not be publicly released.

Divided Community Reactions Reveal Deep Contradictions

This decision sparked intense discussions on the X platform (formerly Twitter). Supporters see it as a model of responsible AI development, reflecting Anthropic's commitment to AI safety. Critics argue that this approach might stifle innovation, turning technological progress into the privilege of a few.

This division reflects a fundamental contradiction within the AI community: should safety or progress be prioritized? A deeper question is, who has the right to decide which technologies are "too dangerous" to be released publicly?

The Logical Chain Behind Cyber Defense Privilege

Anthropic's choice of cyber defense as the sole open area is no accident. In the current geopolitical environment, cybersecurity is a critical component of national security. An AI system capable of achieving a 93.9% code generation accuracy could serve as a defensive tool or an offensive weapon.

By restricting access, Anthropic aims to ensure that this technology is used for defense rather than attack. However, this approach raises new questions: should AI be regulated like weapons when capabilities reach a certain threshold?

The Chain Reaction of Technological Lockdown

More concerning is that Anthropic's decision could set a dangerous precedent. If every AI company begins to independently decide which technologies are "too dangerous," the entire industry might move towards fragmentation. This would not only hinder academic research and the development of the open-source community but could also exacerbate technological monopolies.

Simultaneously, this approach could have the opposite effect. History shows that technological lockdowns often spur other participants to accelerate independent research and development. If Claude Mythos is indeed so powerful, other companies and countries will undoubtedly invest more resources in catching up, potentially developing similar or even more powerful systems without safety considerations.

Redefining "Responsible AI"

Anthropic's decision forces the entire industry to rethink what "responsible AI development" means. Is it complete transparency and openness, allowing society to share both risks and benefits? Or is it leaving a few companies as gatekeepers to decide the scope of technology use?

This case reveals that AI development has entered a new phase: the pace of technological capability advancement has surpassed our ability to establish corresponding governance frameworks. In this situation, unilateral decisions by companies like Anthropic may be necessary temporary measures, but in the long term, we need broader social consensus and international cooperation.

The Claude Mythos incident is not just a product release decision; it marks a profound shift in the AI industry from "what can be done" to "what should be done." Although this shift is painful, it may be a necessary path to ensure AI technology benefits humanity rather than threatens it. However, finding a balance between safety and innovation remains an open question without a standard answer.