Cohere Open-sources Command A+ 218B MoE Model to Reshape Enterprise Sovereign AI

Cohere has open-sourced Command A+, a 218B-parameter sparse MoE model with 25B active parameters and 128K context length, offering significant performance gains over previous versions and competing models under the permissive Apache 2.0 license, enabling sovereign AI deployment at single-node scale.

Technical Specifications and Architecture Innovation

Command A+ has a total of 218B parameters with only 25B active, adopting a sparse MoE design. The input context reaches 128K, with a maximum generation length of 64K, supporting both text and image modalities along with tool calling. The official W4A4 quantized version runs on a single B200 or dual H100 GPUs, offering significantly higher inference efficiency compared to dense models of similar scale.

Direct Comparison with Competing MoE Models

DeepSeek-V2 has a total of 236B parameters with 21B active; Command A+ outperforms it by 12 percentage points on agentic coding tasks. Llama 3.1 405B is a dense model requiring full parameter activation per forward pass; Command A+ achieves 2.3 times higher throughput on the same H100 cluster. Mistral Large uses 8 experts, while Command A+ uses more experts and supports 48 languages, far exceeding Mistral's 12-language support.

Quantitative Improvement in Agentic Capabilities

On τ²-Bench, the telecom scenario score rose from 37% with Command A Reasoning to 85%; on Terminal-Bench Hard, from 3% to 25%. Multimodal document processing and long-horizon reasoning capabilities have been simultaneously enhanced, with a single model integrating all the sub-model functions of the previous Command A series.

A North American telecom operator has already replaced its original hybrid model stack with Command A+ within the North workspace, reducing deployment costs by 41%.

Commercial Value of the Apache 2.0 License

Apache 2.0 allows enterprises to modify, commercially use, and closed-source derivative works without releasing the modified source code. Developers can directly integrate the model into proprietary SaaS products, avoiding GPL copyleft risks. Compared to Llama series custom licenses, this license offers higher legal certainty, facilitating risk management reviews and investment/funding rounds.

  • Enterprises can embed the model into their own hardware devices without paying fees to Cohere
  • Support for mainstream frameworks such as vLLM and Transformers, with near-zero migration cost
  • 48-language coverage provides a foundation for cross-border compliant deployment

Path to Sovereign AI Implementation

The model weights have been uploaded to Hugging Face with multiple lossless quantized versions available. The Model Vault hosting solution meets the needs of organizations that prefer not to build their own inference clusters. Together, these form a complete loop from experimentation to production, lowering the threshold for sovereign AI to the single-node level.