2 Sources
[1]
Hume's new EVI 3 model lets you customize AI voices - how to try it
The company is betting that the future of AI will belong to models that can speak and emote in humanlike voices. Hume AI is launching EVI 3, the third iteration of its Empathic Voice Interface (EVI) model, which can interact with users in a huge variety of humanlike voices. Like ChatGPT's voice mode, EVI 3 comes with an assortment of preprogrammed AI voices, listed by personality and character descriptions, including "Old Knocks Comedian," "Seasoned Life Coach," "Wise Wizard," and "Dungeon Master," as well as the company's namesake, the 18th-century philosopher David Hume.

Crucially, the model also comes with a feature that allows users to build custom AI voices from scratch. Rather than adjusting a long list of specific attributes, as you might when building a Bitmoji or a video game character, you simply describe the characteristics of your desired voice in natural language, and the model does the rest.

The launch reflects a broader effort among AI companies to build more personable and engaging models by training them to exhibit distinct "personalities." Anthropic's Claude was trained to be thoughtful and open-minded, for example, while xAI's Grok is supposed to be edgier, with a sense of humor.

Hume describes itself on its website as working "to ensure that artificial intelligence is built to serve human goals and emotional well-being." That mission statement is reminiscent of those of some of the most preeminent AI developers (OpenAI, for example, is aiming "to ensure that artificial general intelligence...benefits all of humanity"). But whereas the bigger players are mainly oriented around building ever larger and more powerful models, Hume seems primarily focused on fine-tuning the believability of its models, so that they can verbally communicate in a way that not only sounds, but feels, real, down to the little pauses between words and the occasional "umm" peppered into sentences.

The results are impressive. The first time I demoed the model, I asked it to generate a character that spoke in a world-weary but witty working-class British accent -- à la Michael Caine -- and who was a staunch Flat-Earther. When the voice was ready, I asked it why it thought the government and scientists were lying about the shape of the Earth, and it immediately launched into a passionate tirade about why the real logical fallacy was believing an official narrative when all of the direct evidence from one's senses pointed to the opposite story being true (i.e., that the Earth is a flat disc). The voice was lyrical and full of energy, as if we were speaking at some Olde English pub.

In a company blog post published Thursday, Hume wrote that the launch of EVI 3 marks the next step in its mission to "achieve a voice AI experience that can be fully personalized" by the end of this year. "We believe this is an essential step toward voice being the primary way people want to interact with AI."

In 1950, the mathematician Alan Turing proposed his famous test for assessing machine intelligence. The "Imitation Game," as he called it -- now known as the Turing Test -- envisioned a human interviewing both another human and a machine, each hidden behind a partition. If the interviewer couldn't tell which responses were coming from the human and which from the machine, the machine had passed the test and could be considered true artificial intelligence. Seventy-five years later, we have AI tools that can not only write, but actually speak, in a way that seems convincingly human. Many of the latest voice-equipped AI models have none of the mechanical monotone or emotional vacancy characteristic of earlier automated voices, like the ones that greet you when you call your bank.
They instead exhibit a broad range of tenors and personalities, encapsulating what has effectively become an entire subfield of AI research in its own right, sparked by a competition among tech companies to build more personable and engaging software.

The question of how the average person will interact with AI in the future has been a growing preoccupation across Silicon Valley in recent years, as companies have searched for viable successors to chatbots like ChatGPT. OpenAI recently announced a plan to buy io, a company founded by former Apple executive Jony Ive (the designer of the iPhone), with long-term plans to build hardware centered on AI. The company Humane attempted something similar with its AI Pin before that product flopped. Hume is banking on the idea that the future of AI will belong to models that can speak with users in humanlike voices.

When developing EVI 3, Hume compared its performance to some of the most powerful AI voice assistant models currently available, including GPT-4o and Gemini Live, across a few key benchmarks. According to the company blog post, EVI 3 outperformed its competitors in "emotion/style modulation," or adjusting its emotional tone throughout the course of a conversation. It also outperformed GPT-4o in "emotion understanding" -- the ability to recognize and interpret the emotional tenor of users' voices. Finally, early testing showed that it has lower latency than both GPT-4o and Gemini Live, though it was outscored there by the voice model from AI company Sesame.

You can try EVI 3 today through a demo and Hume's iOS app. Hume hasn't announced pricing for the model just yet. An API is slated for release in the coming weeks. The model currently specializes in English but will become proficient in other major languages, including French and Spanish, as it continues to be trained and after it's generally released, according to the company blog post.
[2]
Emotive voice AI startup Hume launches new EVI 3 model with rapid custom voice creation
New York-based AI startup Hume has unveiled its latest Empathic Voice Interface (EVI) conversational AI model, EVI 3 (pronounced "Evee" Three, like the Pokémon character), targeting everything from powering customer support systems and health coaching to immersive storytelling and virtual companionship.

EVI 3 lets users create their own voices by talking to the model (it's voice-to-voice/speech-to-speech), and aims to set a new standard for naturalness, expressiveness, and "empathy," according to Hume -- that is, how users perceive the model's understanding of their emotions and its ability to mirror or adjust its own responses in tone and word choice.

Designed for businesses, developers, and creators, EVI 3 expands on Hume's previous voice models by offering more sophisticated customization, faster responses, and enhanced emotional understanding. Individual users can interact with it today through Hume's live demo on its website and iOS app, while developer access through Hume's proprietary application programming interface (API) is expected in "the coming weeks," according to a blog post from the company. At that point, developers will be able to embed EVI 3 into their own customer service systems, creative projects, or virtual assistants -- for a price (see below).

My own use of the demo let me create a new, custom synthetic voice in seconds based on qualities I described to it -- a mix of warm and confident, with a masculine tone. Speaking to it felt more naturalistic and easy than other AI models, and certainly more so than the stock voices from legacy tech leaders such as Apple with Siri and Amazon with Alexa.

What developers and businesses should know about EVI 3

Hume's EVI 3 is built for a range of uses -- from customer service and in-app interactions to content creation in audiobooks and gaming.
It allows users to specify precise personality traits, vocal qualities, emotional tone, and conversation topics. This means it can produce anything from a warm, empathetic guide to a quirky, mischievous narrator -- down to requests like "a squeaky mouse whispering urgently in a French accent about its scheme to steal cheese from the kitchen."

EVI 3's core strength lies in its ability to integrate emotional intelligence directly into voice-based experiences. Unlike traditional chatbots or voice assistants that rely heavily on scripted or text-based interactions, EVI 3 adapts to how people naturally speak -- picking up on pitch, prosody, pauses, and vocal bursts to create more engaging, humanlike conversations.

However, one big feature Hume's models currently lack -- and which rivals both open source and proprietary, such as ElevenLabs, already offer -- is voice cloning: the rapid replication of a specific person's voice, such as a company CEO's. Hume has indicated it will add such a capability to its Octave text-to-speech model, where it is listed as "coming soon" on Hume's website, and prior reporting by yours truly found the company plans to let users replicate voices from as little as five seconds of audio. Hume has stated it's prioritizing safeguards and ethical considerations before making this feature broadly available. For now, cloning is not available in EVI itself; Hume emphasizes flexible voice customization instead.

Internal benchmarks show users prefer EVI 3 to OpenAI's GPT-4o voice model

According to Hume's own tests with 1,720 users, EVI 3 was preferred over OpenAI's GPT-4o in every category evaluated: naturalness, expressiveness, empathy, interruption handling, response speed, audio quality, voice emotion/style modulation on request, and emotion understanding on request (the "on request" categories fall under the "instruction following" results).
It also usually bested Google's Gemini model family and the models from Sesame, the open-source AI model firm founded by Oculus co-founder Brendan Iribe. EVI 3 also boasts lower latency (~300 milliseconds), robust multilingual support (English and Spanish, with more languages coming), and effectively unlimited custom voices.

Pricing and developer access

Hume offers flexible, usage-based pricing across its EVI, Octave TTS, and Expression Measurement APIs. While EVI 3's specific API pricing has not been announced yet (it is marked as TBA), the pattern suggests it will be usage-based, with enterprise discounts available for large deployments. For reference, EVI 2 is priced at $0.072 per minute -- 30% lower than its predecessor, EVI 1 ($0.102/minute). For creators and developers working with text-to-speech projects, Hume's Octave TTS plans range from a free tier (10,000 characters of speech, roughly 10 minutes of audio) to enterprise-level plans.

For developers working on real-time voice interactions or emotional analysis, Hume also offers a Pay as You Go plan with $20 in free credits and no upfront commitment. High-volume enterprise customers can opt for a dedicated Enterprise plan featuring dataset licenses, on-premises solutions, custom integrations, and advanced support.

Hume's history of emotive AI voice models

Founded in 2021 by Alan Cowen, a former researcher at Google DeepMind, Hume aims to bridge the gap between human emotional nuance and AI interaction. The company trained its models on an expansive dataset drawn from hundreds of thousands of participants worldwide -- capturing not just speech and text, but also vocal bursts and facial expressions. "Emotional intelligence includes the ability to infer intentions and preferences from behavior. That's the very core of what AI interfaces are trying to achieve," Cowen told VentureBeat.
Hume's mission is to make AI interfaces more responsive, humanlike, and ultimately more useful -- whether that's helping a customer navigate an app or narrating a story with just the right blend of drama and humor. In early 2024, the company launched EVI 2, which offered 40% lower latency and 30% reduced pricing compared to EVI 1, alongside new features like dynamic voice customization and in-conversation style prompts. February 2025 saw the debut of Octave, a text-to-speech engine for content creators capable of adjusting emotions at the sentence level with text prompts. With EVI 3 now available for hands-on exploration and full API access just around the corner, Hume hopes to allow developers and creators to reimagine what's possible with voice AI.
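The pricing trajectory above is easy to sanity-check. A minimal sketch, using only the per-minute rates quoted in this article (the 1,000-minutes-per-day usage figure is an illustrative assumption, not from Hume):

```python
# Sanity check of the per-minute pricing quoted in the article:
# EVI 1 at $0.102/min, with EVI 2 described as "30% lower" at $0.072/min.
# EVI 3 API pricing is still TBA, so the usage estimate is purely illustrative.

EVI1_PER_MIN = 0.102   # $/minute
EVI2_PER_MIN = 0.072   # $/minute, as quoted

# Applying the stated 30% discount to EVI 1 gives $0.071/min, within a tenth
# of a cent of the quoted EVI 2 rate -- so "30% lower" is a slight rounding.
derived = round(EVI1_PER_MIN * 0.70, 3)

def monthly_cost(minutes_per_day: float, rate_per_min: float, days: int = 30) -> float:
    """Estimated monthly bill for a voice agent at a given per-minute rate."""
    return round(minutes_per_day * days * rate_per_min, 2)

print(f"30% off EVI 1: ${derived}/min (quoted EVI 2 rate: ${EVI2_PER_MIN}/min)")
print(f"1,000 min/day for 30 days on EVI 2: ${monthly_cost(1000, EVI2_PER_MIN)}")
```

At EVI 2's rate, a deployment handling 1,000 minutes of conversation a day would run about $2,160 a month before any enterprise discount, which is where the usage-based Enterprise plans come in.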
Hume AI launches EVI 3, its latest Empathic Voice Interface model, offering humanlike voices and customization options. The model aims to revolutionize AI-human interactions through emotionally intelligent and personalized voice experiences.
Hume AI, a New York-based startup, has unveiled EVI 3, the latest iteration of its Empathic Voice Interface (EVI) model. This advanced AI voice technology represents a significant step forward in creating more natural and emotionally intelligent AI-human interactions [1][2].
EVI 3 stands out for its ability to generate a wide array of humanlike voices with distinct personalities. Users can choose from pre-programmed options like "Old Knocks Comedian" or "Wise Wizard," or create custom voices using natural language descriptions [1]. The model excels in:
- Emotion/style modulation: adjusting its emotional tone over the course of a conversation
- Emotion understanding: recognizing and interpreting the emotional tenor of users' voices
- Low latency: roughly 300 milliseconds in early testing
According to Hume's internal tests involving 1,720 users, EVI 3 was preferred over OpenAI's GPT-4o in every category evaluated, including naturalness, expressiveness, and empathy. It also generally showed advantages over Google's Gemini model family [2].
Source: VentureBeat
EVI 3 is designed for a wide range of applications, including:
- Customer support systems
- Health coaching
- Immersive storytelling
- Virtual companionship
- Content creation for audiobooks and gaming
The model's flexibility allows it to produce voices ranging from empathetic guides to quirky narrators, catering to diverse user needs [2].
Currently, individual users can interact with EVI 3 through Hume's live demo and iOS app. Developer access via API is expected in the coming weeks, enabling integration into various projects and systems [2]. While pricing details for EVI 3 are yet to be announced, Hume offers flexible, usage-based pricing across its product range [2].
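Neither source documents the forthcoming API itself, only that developers will describe desired voices in natural language. Purely as an illustrative sketch -- the endpoint URL, JSON field names, and auth header below are assumptions, not Hume's actual interface -- a text-described custom-voice request might look like:

```python
# Hypothetical sketch only: the endpoint, payload fields, and header are
# illustrative assumptions, not Hume's real API. Consult the official docs
# once the EVI 3 API ships.
import json
import urllib.request

def build_voice_request(description: str, language: str = "en") -> dict:
    """Bundle a natural-language voice description into a request payload."""
    return {
        "voice_description": description,  # e.g. "warm, confident, masculine tone"
        "language": language,              # EVI 3 currently specializes in English
    }

def request_custom_voice(api_key: str, description: str) -> dict:
    """POST the description to a (placeholder) voice-creation endpoint."""
    req = urllib.request.Request(
        "https://api.example.com/v0/voices",  # placeholder URL, not a real endpoint
        data=json.dumps(build_voice_request(description)).encode(),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The real interface is reportedly speech-to-speech over a live connection rather than a one-shot HTTP call, so a production integration would more likely stream audio over a persistent socket; this sketch only shows the describe-a-voice-in-text idea.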
Source: ZDNet
EVI 3's launch reflects a growing trend in AI development, focusing on creating more personable and engaging models. This aligns with efforts by major tech companies to build AI with distinct personalities, such as Anthropic's Claude and xAI's Grok [1].
Founded in 2021 by Alan Cowen, a former Google DeepMind researcher, Hume aims to bridge the gap between human emotional nuance and AI interaction. The company's models are trained on an extensive dataset capturing speech, text, vocal bursts, and facial expressions from a diverse global participant pool [2].
While EVI 3 represents a significant advancement, it currently lacks the voice cloning capabilities offered by some competitors. Hume has indicated plans to add this feature to its Octave text-to-speech model, emphasizing the need for ethical considerations and safeguards [2].
As AI voice technology continues to evolve, EVI 3 sets a new benchmark in creating more natural, expressive, and emotionally intelligent AI-human interactions, potentially reshaping how we engage with AI in various aspects of our lives.