5 Sources
[1]
The debut of Gemini 3.1 Flash Live could make it harder to know if you're talking to a robot
Text generated by artificial intelligence often has a particular vibe that gives it away as machine-generated, but it has become harder to pick out those idiosyncrasies as the tech has improved. We may be seeing a similar evolution in generative AI audio. Google has announced a new AI audio model called Gemini 3.1 Flash Live -- as the name implies, it's designed for real-time conversation. It's rolling out in some Google products starting today, and developers will be able to start building their own chatty robots with the model, too. Google says this AI is much faster and produces speech with a more natural cadence, aiming to solve a long-running issue with AI-generated speech. As with chatbots, there's always a delay between input and output in generative audio systems. Longer delays and unnatural inflection make conversations feel sluggish and harder to follow. Researchers generally believe 300 milliseconds of latency is about the limit for optimal speech perception, but Google has not specified any particular delay for Gemini 3.1 Flash Live. It just vaguely has the speed you need. But benchmark numbers? Google has plenty of those, which it claims show that 3.1 Flash Live will be a more reliable way to have audio-to-audio AI conversations. For example, a big gain on ComplexFuncBench Audio shows the new model is better at complex, multi-step tasks. Gemini 3.1 Flash Live also tops the charts in the Big Bench Audio test, which evaluates reasoning with a set of 1,000 audio questions. Meanwhile, a strong showing in Scale AI's Audio MultiChallenge means the new Gemini model is better able to cope with hesitation and interruptions in the audio input. Although it outpaces other real-time audio models, Gemini 3.1 Flash Live only manages 36.1 percent in this test. Audio models that are not designed to operate conversationally can reach scores over 50 percent in the MultiChallenge.
The upshot is that Gemini 3.1 Flash Live should sound more like a person, to the point that Google felt it was time to integrate AI flags. The outputs from this model will have SynthID watermarks, which are not perceptible to human listeners. However, they can be detected if someone were to try to pass off Gemini AI speech as the real deal. Google has partnered with companies like Home Depot, Verizon, and others to test the model. They all have glowing reports in the blog post on how well 3.1 Flash Live can mimic human speech. So the next AI assistant you encounter on a phone call might sound much more realistic. Maybe you'll even think you're talking to a person, and SynthID can't help with that. Developers can now access the model in AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience. The latter is essentially a toolkit for agentic shopping. Gemini 3.1 Flash Live will be seen most prominently in Gemini Live and Search Live (a feature of AI Mode). The new conversational AI is rolling out in those products starting today.
[2]
Search Live with Gemini's latest model tries to keep up with your rapid-fire questions
It's available now in Gemini Live and Search Live, the latter of which is now available worldwide. Google is rolling out another new Gemini model. Gemini 3.1 Flash Live is meant to enable quicker and more natural-sounding AI voices, among other, less immediately tangible benefits, and it's available now in a number of places across the Google ecosystem. In a blog post, Google has detailed the improvements that come with Gemini 3.1 Flash Live, which the company describes as its "highest-quality audio and voice model yet." Most people will experience the new model in Search Live and Gemini Live, though it also comes with a number of purported improvements for developers and enterprise customers. Google says that Gemini 3.1 Flash Live makes for "more helpful and natural responses" in the conversational-style Gemini Live and Search Live interfaces, and also responds more quickly than the previous model. Gemini 3.1 Flash Live is "inherently multilingual," a characteristic Google says made a global expansion of Search Live possible. Search Live is now available in multiple languages in more than 200 territories around the world. The update also benefits AI developers, Google says, thanks to better performance. Gemini 3.1 Flash Live scores higher on a number of benchmark tests, though those types of improvements aren't likely to be appreciable from a consumer standpoint. Finally, 3.1 Flash Live purportedly makes for a less miserable experience when interacting with an AI customer service agent. Google says the new model is better able to discern pitch and pace, which lets it tweak its approach when it calculates a customer is getting confused or annoyed, though it's presumably still not as effective on the phone as a well-trained worker. Gemini 3.1 Flash Live is out now. You can experience it in Gemini Live or Search Live starting today.
[3]
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
Today, we're advancing Gemini's real-time dialogue capabilities with Gemini 3.1 Flash Live, our highest-quality audio and voice model yet. It delivers the speed and natural rhythm needed for the next generation of voice-first AI, offering a more intuitive experience for developers, enterprises and everyday users. 3.1 Flash Live is available across Google products. We've improved 3.1 Flash Live's overall quality, making it more reliable for developers and enterprises to build voice-first agents that can complete complex tasks at scale. On ComplexFuncBench Audio, a benchmark that captures multi-step function calling with various constraints, it leads with a score of 90.8%, an improvement over our previous model.
[4]
Gemini Live just doubled its memory, and longer conversations finally work
Gemini Live has expanded and evolved quite a bit since Google first introduced it as a true Google Assistant replacement. From helping with daily problems to analyzing information on your screen, Gemini Live continues to be a solid tool, especially if you have an Android phone. And with the latest update to Gemini 3.1 Flash Live, Google says the assistant is about to get even better.

Faster and more natural responses

One of the biggest upgrades Google focused on in the announcement for Gemini 3.1 Flash Live is that the AI model should respond much faster and more precisely than it has in the past. This should make it easier to use Gemini Live to get instant help with your homework or just to ask it random questions. Additionally, Google claims that responses from Gemini Live should now be more natural overall, whether you're asking about your day or digging into more complicated conversations or topics. The company calls this the "biggest upgrade yet" to the model, though that is a claim we hear with many of these AI updates, and each update is, of course, going to bring more natural responses, since that's the goal. What really matters is how useful the changes actually are when you start using them. That's why one of the most notable things Gemini Live fans will want to be aware of is the expanded context window that Gemini 3.1 Flash Live brings to the table.

Gemini 3.1 Flash Live can follow your conversation twice as long

One of the biggest downfalls of AI models has always been contextual awareness.
These models can only hold a fixed amount of conversation data in their context window; once it fills up, older information starts to be overwritten. When that happens, conversations with the AI can degrade rapidly, as its responses lose much of the context that carried the conversation forward. These limitations can make it difficult to do some of our favorite things with Gemini Live. However, with the upgrade to Gemini 3.1 Flash Live, Google says that Gemini Live's context window has been doubled, allowing it to hold onto a conversation thread twice as long. No exact numbers were provided, but it should at least make it easier to hold longer brainstorming sessions with Gemini Live. This should also make it easier to take advantage of the different things you can do with the assistant, like hands-free meal planning, especially if you already own one of Google's Pixel devices, which have Gemini Live baked right in. Finally, Google is bringing all of these upgrades to everyone as part of Gemini Live and its expansion of Search Live, which makes it easier to search the web using Gemini Live. The increased precision and contextual awareness, as well as the improved multilingual capacity, should make it much easier for millions to use Search Live in more than 200 countries across the globe.
[5]
Google launches Gemini 3.1 Flash Live audio model for developers By Investing.com
Investing.com - Google announced Thursday the release of Gemini 3.1 Flash Live, a new audio and voice model designed to enable real-time dialogue with improved precision and lower latency. The model is available to developers in preview through the Gemini Live API in Google AI Studio, to enterprises via Gemini Enterprise for Customer Experience, and to consumers through Search Live and Gemini Live. The model scored 90.8% on ComplexFuncBench Audio, a benchmark measuring multi-step function calling with constraints. On Scale AI's Audio MultiChallenge, which tests complex instruction following and long-horizon reasoning amid real-world audio interruptions, Gemini 3.1 Flash Live achieved a score of 36.1% with "thinking" enabled. Companies including Verizon (NYSE:VZ), LiveKit and The Home Depot (NYSE:HD) have provided positive feedback on the model's performance in their workflows. The model features improved tonal understanding to recognize acoustic nuances such as pitch and pace, and can dynamically adjust responses to users' expressions of frustration or confusion. In consumer applications, Gemini Live delivers faster responses than the previous model and can maintain conversation context for twice as long. The 3.1 Flash Live model supports the global expansion of Search Live, which is now available in more than 200 countries and territories with multilingual capabilities. All audio generated by 3.1 Flash Live includes SynthID watermarking, an imperceptible marker embedded in the audio output to enable detection of AI-generated content. Google stated the watermarking technology is designed to help prevent misinformation. This article was generated with the support of AI and reviewed by an editor. For more information see our T&C.
Google has unveiled Gemini 3.1 Flash Live, an AI audio model designed for real-time conversations with faster response times and more natural speech patterns. The model scores 90.8% on ComplexFuncBench Audio and doubles the context window for longer conversations. It's now available in Gemini Live, Search Live, and to developers through AI Studio.
Google has announced Gemini 3.1 Flash Live, its most advanced AI audio model, designed to enable real-time conversations with improved precision and lower latency. The model is rolling out across Google products starting today, including Gemini Live and Search Live, while also becoming available for developers through AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience [3].
Described by Google as its "highest-quality audio and voice model yet," Gemini 3.1 Flash Live aims to deliver natural-sounding AI voices that make it increasingly difficult to distinguish machine-generated speech from human conversation [2]. The model produces speech with a more natural cadence, addressing a long-running issue with AI-generated speech where unnatural inflection and delays make conversations feel sluggish [1]. Researchers generally believe 300 milliseconds of latency is about the limit for optimal speech perception, though Google has not specified exact delay numbers for the new model [1].
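To make the 300 ms figure concrete, here is a back-of-the-envelope latency-budget sketch in Python. All per-stage numbers are hypothetical, since Google has not published latency figures for Gemini 3.1 Flash Live; the point is only to show why a native audio-to-audio model has an easier time fitting under the perceptual ceiling than a cascaded speech-to-text, text-model, text-to-speech pipeline.

```python
# Illustrative latency budget for a voice-to-voice AI pipeline.
# The 300 ms ceiling is the research figure cited above; every
# per-stage number below is a made-up example.

PERCEPTION_LIMIT_MS = 300  # rough limit for natural-feeling turn-taking

def response_latency(stages_ms):
    """Total time from end of user speech to start of AI speech."""
    return sum(stages_ms.values())

# Hypothetical cascaded pipeline: separate speech-to-text,
# text model, and text-to-speech stages, each adding delay.
cascaded = {
    "speech_to_text": 150,
    "language_model": 250,
    "text_to_speech": 120,
    "network_round_trip": 80,
}

# Hypothetical native audio-to-audio model, which skips the
# intermediate text stages entirely.
native_audio = {
    "audio_model": 180,
    "network_round_trip": 80,
}

for name, stages in [("cascaded", cascaded), ("native audio", native_audio)]:
    total = response_latency(stages)
    verdict = "within" if total <= PERCEPTION_LIMIT_MS else "over"
    print(f"{name}: {total} ms ({verdict} the {PERCEPTION_LIMIT_MS} ms budget)")
```

With these example numbers the cascaded pipeline lands at 600 ms, double the budget, while the native path stays under it; whatever Gemini's real figures are, this is the structural argument for audio-to-audio models.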
The AI audio model demonstrates significant improvements across multiple benchmark tests. On ComplexFuncBench Audio, a benchmark that captures multi-step function calling with various constraints, Gemini 3.1 Flash Live achieved a score of 90.8%, up from the previous model [3][5]. The model also tops the charts in the Big Bench Audio test, which evaluates reasoning with a set of 1,000 audio questions [1].
On Scale AI's Audio MultiChallenge, which tests complex instruction following and long-horizon reasoning amid conversational interruptions, Gemini 3.1 Flash Live scored 36.1% [5]. While this demonstrates the model's improved ability to cope with hesitation and interruptions in audio input, it still lags behind non-conversational audio models that can reach scores over 50% in the same test [1].
One of the most significant upgrades is an expanded context window, which has been doubled [4]. This allows Gemini Live to hold onto a conversation thread twice as long before contextual awareness begins to degrade. While Google did not provide exact numbers, this improvement should make it easier to conduct longer brainstorming sessions and more complex interactions without the AI losing track of earlier parts of the conversation [4].
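The mechanism being described, where early turns drop out once the window fills, can be sketched as a toy rolling buffer. The token budgets and the word-count "tokenizer" here are invented for illustration, since Google gave no exact numbers; the sketch only shows why doubling the budget keeps early turns alive longer.

```python
from collections import deque

class RollingContext:
    """Toy model of a fixed context window: once the token budget
    is exceeded, the oldest turns are evicted and 'forgotten'."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.turns = deque()
        self.used = 0

    def add_turn(self, text):
        tokens = len(text.split())  # crude stand-in for real tokenization
        self.turns.append((text, tokens))
        self.used += tokens
        while self.used > self.max_tokens:  # evict oldest turns first
            _, dropped_tokens = self.turns.popleft()
            self.used -= dropped_tokens

    def remembers(self, text):
        return any(turn == text for turn, _ in self.turns)

# A small window and one twice its size, fed the same conversation.
small, large = RollingContext(10), RollingContext(20)
for ctx in (small, large):
    ctx.add_turn("plan a three course dinner for six people")  # 8 tokens
    ctx.add_turn("make the main course vegetarian please")     # 6 tokens

print(small.remembers("plan a three course dinner for six people"))  # False: evicted
print(large.remembers("plan a three course dinner for six people"))  # True: retained
```

The small window overflows on the second turn and silently drops the original request, which is exactly the "losing track of earlier parts of the conversation" failure mode; the doubled window holds both turns.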
The model also features improved tonal understanding, enabling it to recognize acoustic nuances such as pitch and pace [5]. This capability allows the AI to dynamically adjust its responses when it detects that a user is becoming frustrated or confused, which is particularly useful for AI customer service agents [2].
Google has partnered with companies including Home Depot, Verizon, and LiveKit to test the model, with all providing positive feedback on its performance [5]. The model is now available to developers in preview through the Gemini Live API in AI Studio, and to enterprises via Gemini Enterprise for Customer Experience [5].
Because Gemini 3.1 Flash Live sounds increasingly human-like, Google has integrated SynthID watermarks into all audio outputs [1]. These imperceptible markers are embedded in the audio output to enable detection of AI-generated content and help prevent misinformation [5]. However, while SynthID can help identify AI-generated audio after the fact, it cannot alert users during a live conversation that they're speaking with a voice-first AI rather than a human [1].
The model is "inherently multilingual," a characteristic that enabled the global expansion of Search Live [2]. Search Live is now available in multiple languages in more than 200 countries and territories around the world [2][5]. This makes it easier to search the web using natural conversation, with the improved precision and contextual awareness that the new model provides [4].