Cohere launches an open source voice model specifically for transcription

Mar 27, 2026, TechCrunch, approx. 5 min read

Cohere, open source voice model, speech transcription, self-hosted AI, multilingual support

Enterprise AI company Cohere on Thursday launched its first voice model, Transcribe, an open source automatic speech recognition model that can be used for tasks like note-taking and speech analysis.
Relatively light at just 2 billion parameters, the model is meant for use with consumer-grade GPUs for those who want to self-host it. It currently supports 14 languages: English, French, German, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Chinese, Japanese, Korean, Vietnamese, and Arabic.
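As a rough illustration of why a 2-billion-parameter model fits on consumer-grade GPUs, the sketch below estimates the memory needed just to hold the weights. The numeric precisions are assumptions, since the article does not say which formats Transcribe ships in, and the estimate ignores activations and audio buffers.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory (in GB, 1 GB = 1e9 bytes) needed to store model weights."""
    return num_params * bytes_per_param / 1e9


TRANSCRIBE_PARAMS = 2e9  # "2 billion parameters", per the article

# Assumed precisions (not stated in the article): 2 bytes for fp16/bf16, 4 bytes for fp32.
print(weight_memory_gb(TRANSCRIBE_PARAMS, 2))  # 4.0
print(weight_memory_gb(TRANSCRIBE_PARAMS, 4))  # 8.0
```

Under the 16-bit assumption, the weights alone would take about 4 GB, which is within reach of many consumer graphics cards.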
Cohere says Transcribe beats models such as Zoom Scribe v1, IBM Granite 4.0 1B, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B Speech on <a href="https://huggingface.co/spaces/hf-audio/open_asr_leaderboard" target="_blank" rel="noreferrer noopener nofollow">the Hugging Face Open ASR leaderboard</a>, achieving an average word error rate (WER) of 5.42%, lower than any other model on the benchmark.
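For readers unfamiliar with the metric, word error rate is the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the leaderboard's own scoring code. It assumes simple lowercasing and whitespace splitting, whereas real benchmarks typically apply more thorough text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level edit distance."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One wrong word out of six in the reference.
print(round(word_error_rate("the cat sat on the mat", "the cat sat on a mat"), 2))  # 16.67
```

By this measure, an average WER of 5.42 corresponds to roughly one error for every 18 reference words.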
The company claims Transcribe had an average win rate of 61% over other models when human evaluators assessed its transcriptions for accuracy, coherence, and usability. However, the model fell behind its rivals when it had to transcribe Portuguese, German, and Spanish.
Cohere says Transcribe can process 525 minutes of audio in one minute, a high throughput for a model of its class.
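To put the throughput figure in everyday terms, the short calculation below converts Cohere's claimed rate into the time needed for a recording of a given length. It assumes the rate holds steadily regardless of hardware or audio content, which the article does not specify.

```python
CLAIMED_SPEEDUP = 525.0  # minutes of audio processed per minute, per Cohere's claim


def processing_seconds(audio_minutes: float, speedup: float = CLAIMED_SPEEDUP) -> float:
    """Seconds needed to transcribe a recording at a given speedup over real time."""
    return audio_minutes * 60.0 / speedup


# A one-hour meeting recording at the claimed rate.
print(round(processing_seconds(60), 2))  # 6.86
```

At that rate, an hour-long meeting would be transcribed in under seven seconds.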
The company is planning to integrate Transcribe into its enterprise agent orchestration platform, <a href="https://cohere.com/north" target="_blank" rel="noreferrer noopener nofollow">North</a>, and is making the model available through its <a href="http://dashboard.cohere.com/" target="_blank" rel="noreferrer noopener nofollow">API</a> for free. The model will also be available on <a href="https://cohere.com/solutions/model-vault" target="_blank" rel="noreferrer noopener nofollow">Model Vault</a>, Cohere’s managed inference platform.
Speech recognition models are becoming increasingly popular as demand rises for note-taking and dictation apps like Granola and <a href="https://techcrunch.com/2026/02/23/wispr-flow-launches-an-android-app-for-ai-powered-dictation/">Wispr Flow</a>.
Earlier this year, Cohere reportedly <a href="https://www.cnbc.com/2026/02/13/ai-startup-cohere-revenue-ipo.html" target="_blank" rel="noreferrer noopener nofollow">told</a> investors that it was generating annual recurring revenue of $240 million in 2025, and its CEO, Aidan Gomez, was cited as saying that the startup <a href="https://techcrunch.com/2026/02/13/coheres-240m-year-sets-stage-for-ipo/">may go public “soon”</a>.