2 Sources
[1]
Speechify adds voice typing and voice assistant to its Chrome extension
Speechify has largely been a tool that helps you listen to articles, PDFs, and documents. The company is now adding voice detection features to its Chrome extension, including voice typing and a voice assistant that answers your questions. In the last 12 months, there has been a proliferation of voice detection tools, thanks to overall quality improvement in speech recognition models. Speechify is hitching its wagon to this train and launching its own dictation tool with support for English. Just like other dictation tools, Speechify's voice typing corrects errors and removes filler words. In my short test of just more than a day, I felt there was a lot of room for improvement in Speechify's tool. For instance, the tools work fine with Gmail and Google Docs, but on sites like WordPress, I have had difficulty in triggering the voice dictation and having it work well. The company said that it is adding optimization for popular sites gradually. In terms of accuracy, the word error rate was higher than some other tools like Wispr Flow, Willow, and Monologue. Speechify noted that its model learns faster as you use it more, and the error rate will gradually decrease. The startup is also launching a conversational voice assistant that lives in the sidebar of your browser. You can ask it questions about the website, such as "what are the three key ideas?" or "explain this in simpler terms." While ChatGPT and Gemini have conversational modes, Speechify's argument is that they are treated as an afterthought in their apps, and the startup's own tool has voice as front and center. "We believe that chat will always be the default user experience in ChatGPT and Gemini when you open the apps. That's what their users expect. Voice will always be secondary - and in many cases, an afterthought for ChatGPT and Gemini. 
We know from several years of building Speechify that there's a large portion of the market, which includes our users, who want voice as the primary, default setting every time they open an app and talk to AI," Rohan Pavuluri, the company's chief business officer, told TechCrunch over email. One notable hiccup with this is that Speechify's assistant doesn't currently work with browsers with in-built sidebar assistants like OpenAI's Atlas, Perplexity's Comet, and Dia. The startup is not too worried about that as the extension is largely intended for Chrome and its massive user base. Speechify said that it plans to include both voice typing and a voice assistant in all its apps across desktop and mobile gradually. The startup also wants to move into developing agents that complete tasks on your behalf. The startup didn't reveal its full roadmap, but gave one example: making calls for you to make an appointment or wait on hold with customer support of a company. Other companies like Truecaller and Cloaked have been chasing similar targets.
[2]
Speechify adds voice typing and assistant to Chrome
Speechify, a company known for text-to-speech tools that convert articles, PDFs, and documents into audio, has introduced voice typing and a voice assistant to its Chrome extension. This expansion responds to the surge in voice-detection tools over the past 12 months, driven by advancements in speech-recognition models. The features support English and include error correction and filler word removal. Speechify originally focused on enabling users to listen to written content through its platform, transforming static text into spoken narratives for easier consumption. With the addition of voice-detection capabilities, the company shifts toward interactive audio experiences. The voice typing function allows dictation directly within the browser, capturing spoken words and converting them to text while automatically addressing common inaccuracies such as misheard terms or repeated phrases. This aligns with broader industry trends where improved neural networks in speech recognition have reduced latency and increased precision, making real-time voice input viable for everyday applications. During a testing period exceeding one day, the voice typing performed reliably in applications like Gmail and Google Docs, where activation occurred smoothly and text insertion followed spoken input without significant delays. Challenges arose on platforms such as WordPress, where initiating the dictation proved inconsistent and the output occasionally contained unresolved errors. Speechify representatives explained that enhancements for widely used websites are being implemented in phases to ensure compatibility and refine performance across diverse environments. Comparisons of accuracy revealed that Speechify's voice typing exhibited a higher word-error rate compared to competitors including Wispr Flow, Willow, and Monologue. These alternatives demonstrated lower instances of incorrect transcriptions in similar scenarios. 
Speechify emphasized that its underlying model adapts more rapidly to individual user patterns through continued interaction, leading to a progressive decline in the error rate as familiarity with the speaker's voice and speaking style accumulates. The voice assistant integrates into the browser's sidebar, providing a persistent interface for natural language queries related to the active webpage. Users can pose specific requests, such as identifying the three primary concepts in the content or requesting a simplified explanation of complex sections. This setup facilitates quick comprehension without manual navigation, enhancing accessibility for auditory learners or those multitasking. Speechify positions voice as the central interaction mode, contrasting with platforms like ChatGPT and Gemini. Rohan Pavuluri, the company's chief business officer, stated in an email to TechCrunch, "We believe that chat will always be the default user experience in ChatGPT and Gemini when you open the apps. That's what their users expect. Voice will always be secondary - and in many cases, an afterthought for ChatGPT and Gemini. We know from several years of building Speechify that there's a large portion of the market, which includes our users, who want voice as the primary, default setting every time they open an app and talk to AI." This perspective draws from Speechify's established user base, which has long prioritized audio over text-based engagement. Compatibility limitations exist for browsers equipped with native sidebar assistants, such as OpenAI's Atlas, Perplexity's Comet, and Dia, where the Speechify tool does not activate. The extension primarily targets Chrome, leveraging its extensive user population for widespread adoption and feedback collection. Implementation of both the voice typing and assistant extends beyond the Chrome extension. 
Speechify intends to incorporate these functionalities into its full suite of desktop and mobile applications over time, ensuring consistent availability across devices and operating systems. Beyond current releases, Speechify is advancing development of autonomous agents designed to execute tasks independently. One demonstrated capability involves placing outbound calls to secure appointments or managing wait times on customer support lines, freeing users from direct involvement. Similar initiatives are underway at other firms, including Truecaller and Cloaked, which are also engineering agents for automated interactions in communication and privacy contexts.
Speechify launches voice typing and conversational AI assistant features for its Chrome extension, marking a strategic shift from pure text-to-speech to interactive voice-first AI experiences. The company aims to differentiate itself by prioritizing voice over text-based interactions.

Speechify, traditionally known for its text-to-speech capabilities that convert articles, PDFs, and documents into audio, has launched significant new features for its Chrome extension. The company is now offering voice typing and a conversational AI assistant, marking a strategic pivot toward interactive voice-first experiences [1][2].

This expansion comes amid a surge in voice-detection tools over the past 12 months, driven by substantial improvements in speech recognition models. The new features support English and include automatic error correction and filler word removal, positioning Speechify to compete in the growing voice AI market [1].

The voice typing functionality allows users to dictate text directly within their browser, converting spoken words to text while automatically addressing common inaccuracies. Testing over a little more than a day revealed mixed results across different platforms. The feature performed reliably in applications like Gmail and Google Docs, where activation occurred smoothly and text insertion followed spoken input without significant delays [2].

However, challenges emerged on platforms such as WordPress, where initiating dictation proved inconsistent and output occasionally contained unresolved errors. Speechify representatives acknowledged these limitations, explaining that optimizations for popular websites are being implemented gradually to ensure compatibility across diverse environments [1].

Accuracy comparisons revealed that Speechify's voice typing exhibited a higher word-error rate than competitors including Wispr Flow, Willow, and Monologue. The company emphasized that its underlying model adapts more rapidly to individual user patterns through continued interaction, leading to progressive improvement in accuracy as the system becomes familiar with the speaker's voice and speaking style [2].

The conversational voice assistant integrates into the browser's sidebar, providing a persistent interface for natural language queries about active webpages. Users can ask specific questions such as "what are the three key ideas?" or "explain this in simpler terms," facilitating quick comprehension without manual navigation [1].

Speechify positions voice as the central interaction mode, contrasting with established platforms like ChatGPT and Gemini. Rohan Pavuluri, the company's chief business officer, explained the company's strategic differentiation: "We believe that chat will always be the default user experience in ChatGPT and Gemini when you open the apps. That's what their users expect. Voice will always be secondary - and in many cases, an afterthought for ChatGPT and Gemini. We know from several years of building Speechify that there's a large portion of the market, which includes our users, who want voice as the primary, default setting every time they open an app and talk to AI" [2].

Compatibility issues exist for browsers equipped with native sidebar assistants, including OpenAI's Atlas, Perplexity's Comet, and Dia, where the Speechify tool does not activate. The extension primarily targets Chrome, leveraging its extensive user base for widespread adoption and feedback collection [2].

Speechify plans to incorporate both voice typing and assistant functionalities into its complete suite of desktop and mobile applications over time, ensuring consistent availability across devices and operating systems. Beyond current releases, the company is developing autonomous agents designed to execute tasks independently, including placing outbound calls to secure appointments or managing customer support wait times [1][2].

Summarized by Navi