ElevenLabs Launches Scribe: A Highly Accurate AI Speech-to-Text Model

ElevenLabs, an AI startup valued at $3.3 billion, has introduced Scribe, a new speech-to-text model claiming 97% accuracy in English and support for over 99 languages, positioning itself as a strong competitor in the AI transcription market.

ElevenLabs Introduces Scribe: A New Benchmark in AI Speech-to-Text Technology

ElevenLabs, an AI startup known for its audio generation capabilities, has launched Scribe, a standalone speech-to-text model that claims to set new standards in transcription accuracy. This move comes on the heels of a substantial $180 million funding round that valued the company at $3.3 billion [1][2].

Unprecedented Accuracy Across Multiple Languages

Scribe boasts support for over 99 languages, with a word error rate of less than 5% in more than 25 languages. The model claims a 97% accuracy rate for English, while languages such as Italian have achieved an impressive 98.7% accuracy [1][2][3]. Other languages in the high-accuracy category include French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese [4].
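For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) is conventionally computed: the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The function name and the lowercase, whitespace-split normalization are illustrative assumptions; the article does not describe how ElevenLabs normalized text for its benchmarks, nor does it state that its accuracy figures equal 1 minus WER, although that is a common reading (under it, 97% accuracy would correspond to roughly a 3% WER).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Normalization here (lowercasing, whitespace split) is an assumption
    for illustration, not the method used in the cited benchmarks.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}, accuracy (assuming 1 - WER): {1 - wer:.3f}")
```

One substituted word in a six-word reference gives a WER of 1/6, which shows why short test clips can swing the figure sharply.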

Outperforming Industry Giants

According to ElevenLabs' benchmarks, Scribe has outperformed notable competitors such as Google's Gemini 2.0 Flash, OpenAI's Whisper Large v3, and Deepgram Nova-3 in FLEURS and Common Voice benchmark tests across multiple languages [2][3]. This positions ElevenLabs as a formidable player in the speech-to-text market, challenging established names like Otter, Fireflies, and TurboScribe [3].

Advanced Features for Enhanced Transcription

Scribe offers several sophisticated features that set it apart:

  1. Speaker Diarization: The ability to differentiate up to 32 speakers in a single audio file [2].
  2. Word-level Timestamps: Enabling accurate subtitle generation [1].
  3. Non-verbal Audio Event Detection: Identifying elements such as laughter, sound effects, and background noise [2].
  4. Structured Output: Facilitating easy integration into various applications [1][3].
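To illustrate why word-level timestamps matter for subtitles, here is a minimal sketch that groups timed words into SRT cues. The input shape (a list of dicts with `text`, `start`, and `end` in seconds) and the names `format_srt_time`, `words_to_srt`, and `max_words` are hypothetical, chosen for this example; the article does not describe Scribe's actual output schema.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most max_words each.

    Each word dict is assumed (hypothetically) to carry 'text', 'start',
    and 'end' keys, with times in seconds.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_srt_time(chunk[0]["start"])  # cue starts with its first word
        end = format_srt_time(chunk[-1]["end"])     # and ends with its last word
        text = " ".join(word["text"] for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.4},
        {"text": "world", "start": 0.5, "end": 0.9},
    ]
    print(words_to_srt(sample, max_words=1))
```

A fixed word count per cue is the simplest grouping rule; a production subtitle tool would more likely also break on pauses, punctuation, or speaker changes.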

Pricing and Availability

ElevenLabs has priced Scribe competitively at $0.40 per hour of transcribed audio, with an introductory 50% discount for the first six weeks [1][3]. The model is accessible through the ElevenLabs website and API, allowing users to upload audio or video files for formatted transcripts [1][3].
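As a rough guide to what these prices mean at volume, the sketch below estimates cost from the two figures quoted above. It assumes the 50% discount applies uniformly to the hourly rate and that billing is proportional to audio duration; the article does not state billing granularity, minimums, or plan tiers, and the function name is illustrative.

```python
from decimal import Decimal

RATE_PER_HOUR = Decimal("0.40")        # USD per hour of audio, as quoted in the article
INTRO_DISCOUNT = Decimal("0.50")       # 50% introductory discount, first six weeks


def transcription_cost(audio_hours: float, discounted: bool = False) -> Decimal:
    """Estimate cost in USD, assuming strictly proportional billing."""
    cost = Decimal(str(audio_hours)) * RATE_PER_HOUR
    if discounted:
        cost *= 1 - INTRO_DISCOUNT
    return cost.quantize(Decimal("0.01"))  # round to whole cents


if __name__ == "__main__":
    print(transcription_cost(1000))                   # 1,000 hours at list price
    print(transcription_cost(1000, discounted=True))  # same volume during the discount
```

Under these assumptions, 1,000 hours of audio comes to $400.00 at the list price and $200.00 during the introductory period.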

Future Developments and Market Impact

While Scribe currently focuses on pre-recorded audio for high-accuracy transcription, ElevenLabs has announced plans to release a low-latency version for real-time applications in the near future [2][4]. This development could significantly expand Scribe's utility in live communication scenarios and further disrupt the market.

Implications for Enterprise Users

For businesses, Scribe presents a powerful tool for scalable, high-accuracy transcription. Its multi-language support and advanced features make it particularly valuable for multinational corporations, media companies, and customer support applications [2]. The competitive pricing and API-based integration also position Scribe as an attractive option for enterprises requiring high-volume transcription services [2].

As the AI-driven audio model market continues to evolve, ElevenLabs' launch of Scribe, alongside developments from competitors like Hume AI's Octave, signals a new era of specialized solutions for both transcription and synthetic voice applications [2]. This progression promises to enhance content production, customer engagement, and accessibility tools across various industries.

© 2026 TheOutpost.AI All rights reserved