ElevenLabs Launches Scribe: A Highly Accurate AI Speech-to-Text Model

ElevenLabs, an AI startup valued at $3.3 billion, has introduced Scribe, a new speech-to-text model claiming 97% accuracy in English and support for over 99 languages, positioning itself as a strong competitor in the AI transcription market.

ElevenLabs Introduces Scribe: A New Benchmark in AI Speech-to-Text Technology

ElevenLabs, an AI startup known for its audio generation capabilities, has launched Scribe, a standalone speech-to-text model that claims to set new standards in transcription accuracy. The launch follows a $180 million funding round that valued the company at $3.3 billion [1][2].

Unprecedented Accuracy Across Multiple Languages

Scribe supports over 99 languages, with a reported word error rate below 5% in more than 25 of them. ElevenLabs claims 97% accuracy for English and 98.7% for Italian [1][2][3]. Other languages in the high-accuracy category include French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese [4].
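The article quotes both a word error rate (WER) and an accuracy percentage without saying how the two relate. A minimal sketch, assuming accuracy is reported as 1 minus WER (the article does not confirm how ElevenLabs computes it), in which case 97% accuracy would correspond to a WER of roughly 3%. The function names here are illustrative, not part of any ElevenLabs tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Dynamic-programming edit distance over words, keeping one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - WER, floored at 0 because WER can exceed 1."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))
```

Published benchmark figures usually also depend on text normalization (punctuation, casing, number formatting), which this sketch reduces to lowercasing and whitespace splitting.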

Outperforming Industry Giants

According to ElevenLabs' benchmarks, Scribe has outperformed notable competitors such as Google's Gemini 2.0 Flash, OpenAI's Whisper Large v3, and Deepgram Nova-3 in FLEURS and Common Voice benchmark tests across multiple languages [2][3]. This positions ElevenLabs as a formidable player in the speech-to-text market, challenging established names like Otter, Fireflies, and TurboScribe [3].

Advanced Features for Enhanced Transcription

Scribe offers several sophisticated features that set it apart:

  1. Speaker Diarization: The ability to differentiate up to 32 speakers in a single audio file [2].
  2. Word-level Timestamps: Enabling accurate subtitle generation [1].
  3. Non-verbal Audio Event Detection: Identifying elements such as laughter, sound effects, and background noise [2].
  4. Structured Output: Facilitating easy integration into various applications [1][3].
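To illustrate how word-level timestamps and speaker labels could feed subtitle generation, here is a minimal sketch that turns timed words into SRT cues. The `Word` structure and its field names are hypothetical and chosen for illustration; they are not the actual Scribe response schema, which the article does not describe.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape for illustration; not the actual Scribe response schema.
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as the HH:MM:SS,mmm form used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into cues, starting a new cue on a speaker change or when full."""
    cues: list[list[Word]] = []
    for word in words:
        if cues and cues[-1][0].speaker == word.speaker and len(cues[-1]) < max_words:
            cues[-1].append(word)
        else:
            cues.append([word])

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        timing = f"{format_timestamp(cue[0].start)} --> {format_timestamp(cue[-1].end)}"
        blocks.append(f"{index}\n{timing}\n[{cue[0].speaker}] {text}")
    return "\n\n".join(blocks) + "\n"
```

Breaking cues on speaker changes keeps each subtitle attributable to one voice; a production pipeline would likely also split on pauses and line length.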

Pricing and Availability

ElevenLabs has priced Scribe at $0.40 per hour of transcribed audio, with an introductory 50% discount for the first six weeks [1][3]. The model is accessible through the ElevenLabs website and API, allowing users to upload audio or video files for formatted transcripts [1][3].
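For budgeting, the quoted prices reduce to simple arithmetic. The sketch below assumes billing scales linearly with audio duration, with no minimum charge or per-file rounding; the article does not state those details, and the function name is illustrative.

```python
from decimal import Decimal

LIST_PRICE_PER_HOUR = Decimal("0.40")  # USD per hour of audio, as reported
INTRO_DISCOUNT = Decimal("0.50")       # 50% introductory discount, as reported


def transcription_cost(audio_hours, discounted: bool = False) -> Decimal:
    """Estimated cost in USD for a number of audio hours, rounded to cents."""
    rate = LIST_PRICE_PER_HOUR
    if discounted:
        # Assumption: the discount applies uniformly to the hourly rate.
        rate = rate * (Decimal("1") - INTRO_DISCOUNT)
    return (Decimal(str(audio_hours)) * rate).quantize(Decimal("0.01"))
```

Under these assumptions, 1,000 hours of audio would cost $400 at list price and $200 during the introductory period.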

Future Developments and Market Impact

While Scribe currently focuses on pre-recorded audio for high-accuracy transcription, ElevenLabs has announced plans to release a low-latency version for real-time applications in the near future [2][4]. This development could significantly expand Scribe's utility in live communication scenarios and further disrupt the market.

Implications for Enterprise Users

For businesses, Scribe presents a powerful tool for scalable, high-accuracy transcription. Its multi-language support and advanced features make it particularly valuable for multinational corporations, media companies, and customer support applications [2]. The competitive pricing and API-based integration also position Scribe as an attractive option for enterprises requiring high-volume transcription services [2].

As the AI-driven audio model market continues to evolve, ElevenLabs' launch of Scribe, alongside developments from competitors like Hume AI's Octave, signals a new era of specialized solutions for both transcription and synthetic voice applications [2]. This progression promises to enhance content production, customer engagement, and accessibility tools across various industries.

TheOutpost.ai

© 2025 Triveous Technologies Private Limited