ElevenLabs Launches Scribe: A Highly Accurate AI Speech-to-Text Model

ElevenLabs Introduces Scribe: A New Benchmark in AI Speech-to-Text Technology

ElevenLabs, an AI startup known for its audio generation tools, has launched Scribe, a standalone speech-to-text model that the company says sets a new standard for transcription accuracy. The launch follows a $180 million funding round that valued the company at $3.3 billion [1].

Unprecedented Accuracy Across Multiple Languages

Scribe supports over 99 languages, and ElevenLabs reports a word error rate below 5% in more than 25 of them. The company claims about 97% accuracy for English and 98.7% for Italian [1]. Other languages in the high-accuracy category include French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese [4].
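
To make the word error rate figures concrete, the sketch below computes WER the standard way: the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. It is illustrative only and is not ElevenLabs' evaluation code; the function name and sample sentences are made up for this example. Reading "accuracy" as 1 minus WER is also an assumption, since the reports do not say how the accuracy percentages were derived.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample: one wrong word out of nine.
    reference = "the quick brown fox jumps over the lazy dog"
    hypothesis = "the quick brown fox jumped over the lazy dog"
    wer = word_error_rate(reference, hypothesis)
    print(f"WER: {wer:.1%}, accuracy (assumed 1 - WER): {1 - wer:.1%}")
```

Under this reading, a WER below 5% means fewer than one word in twenty needs correcting.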

Outperforming Industry Giants

According to ElevenLabs' own benchmarks, Scribe outperformed Google's Gemini 2.0 Flash, OpenAI's Whisper Large v3, and Deepgram Nova-3 on the FLEURS and Common Voice benchmarks across multiple languages [2]. This positions ElevenLabs as a serious competitor in the speech-to-text market, challenging established transcription products such as Otter, Fireflies, and TurboScribe [3].

Advanced Features for Enhanced Transcription

Scribe offers several sophisticated features that set it apart:

Speaker Diarization: The ability to differentiate up to 32 speakers in a single audio file [2].
Word-level Timestamps: Enabling accurate subtitle generation [1].
Non-verbal Audio Event Detection: Identifying elements such as laughter, sound effects, and background noise [2].
Structured Output: Facilitating easy integration into various applications [1][3].
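
As an illustration of why word-level timestamps matter for subtitles, the sketch below groups timed words into SRT cues. The input format (a list of dicts with "text", "start", and "end" keys, times in seconds) is a hypothetical shape chosen for this example and should not be read as Scribe's documented output schema; the function names and the seven-words-per-cue limit are likewise assumptions.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timed words into numbered SRT cues of at most max_words words."""
    cues = []
    for index, start in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[start:start + max_words]
        text = " ".join(word["text"] for word in chunk)
        # Each cue spans from its first word's start to its last word's end.
        begin = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        cues.append(f"{index}\n{begin} --> {end}\n{text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    # Hypothetical word timings, not real model output.
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.4},
        {"text": "world", "start": 0.5, "end": 0.9},
    ]
    print(words_to_srt(sample))
```

A production subtitle pipeline would usually also break cues at sentence boundaries and long pauses rather than on word count alone.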

Pricing and Availability

ElevenLabs has priced Scribe competitively at $0.40 per hour of transcribed audio, with an introductory 50% discount for the first six weeks [1]. The model is accessible through the ElevenLabs website and API, allowing users to upload audio or video files for formatted transcripts [1].
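
For a rough sense of scale, the quoted prices can be turned into a quick cost estimate. The sketch below uses only the two figures reported above ($0.40 per hour and the 50% introductory discount); the function name is made up, and the assumption that partial hours are billed pro rata is ours, not something the reports confirm.

```python
from decimal import Decimal

PRICE_PER_HOUR = Decimal("0.40")         # list price reported at launch
INTRODUCTORY_DISCOUNT = Decimal("0.50")  # 50% off during the first six weeks


def transcription_cost(hours, introductory=False):
    """Estimate cost in US dollars for a given number of audio hours.

    Assumes pro-rata billing for partial hours (an assumption, not a
    documented billing rule).
    """
    cost = PRICE_PER_HOUR * Decimal(str(hours))
    if introductory:
        cost *= 1 - INTRODUCTORY_DISCOUNT
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(transcription_cost(100))                     # 100 hours at list price
    print(transcription_cost(100, introductory=True))  # same volume, discounted
```

At list price, 100 hours of audio comes to $40, or $20 during the introductory period.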

Future Developments and Market Impact

While Scribe currently focuses on pre-recorded audio for high-accuracy transcription, ElevenLabs has announced plans to release a low-latency version for real-time applications in the near future [2]. This development could significantly expand Scribe's utility in live communication scenarios and further disrupt the market.

Implications for Enterprise Users

For businesses, Scribe offers scalable, high-accuracy transcription. Its multi-language support and features such as diarization and timestamps make it particularly relevant for multinational corporations, media companies, and customer support applications [2]. The competitive pricing and API-based integration also make Scribe an attractive option for enterprises that need high-volume transcription [2].

As the market for AI audio models continues to evolve, ElevenLabs' launch of Scribe, alongside developments from competitors such as Hume AI's Octave, points to a growing field of specialized models for both transcription and synthetic voice [2]. These tools could improve content production, customer engagement, and accessibility across industries.

References

[1] ElevenLabs' new speech-to-text model claims 97% accuracy

[2] ElevenLabs' new speech-to-text model Scribe is here with highest accuracy rate so far (96.7% for English)

[3] ElevenLabs Unveils Scribe, a Speech-to-Text Transcription Model to Rival Otter, TurboScribe, and Others

[4] ElevenLabs is launching its own speech-to-text model | TechCrunch
