Google Enhances Gemini Live with Visual Guidance and Native Audio Model

Reviewed by Nidhi Govil

Google unveils significant upgrades to Gemini Live, including visual guidance capabilities and a new native audio model, enhancing its AI assistant's ability to interact more naturally and helpfully with users.

Google Unveils Major Upgrades to Gemini Live

Google has announced significant enhancements to its AI assistant, Gemini Live, introducing new features that promise to make interactions more natural, visual, and helpful. These upgrades, unveiled alongside the launch of the Pixel 10 series, represent a substantial leap forward in AI-assisted daily tasks and communication [1][2].

Visual Guidance: Seeing the World Through AI Eyes

One of the most notable additions to Gemini Live is the visual guidance feature. This allows the AI to provide real-time visual cues by highlighting objects directly on the user's screen when the camera is shared. Google describes this as a "powerful new way to learn and solve problems together" [2][5].

Source: 9to5Google

The practical applications of this feature are diverse:

  • Fashion assistance: Gemini can highlight the best sneaker choice to match an outfit [2].
  • DIY help: When faced with a toolbox, Gemini can point out the correct tool for a specific task [2][3].
  • Assembly guidance: Users can show parts to the camera, and Gemini will visually indicate how they fit together [3].

This feature is launching first on the Pixel 10 series, with plans to expand to other Android devices and iOS in the coming weeks [2].

Native Audio Model: A More Expressive AI

Gemini Live is set to become more attuned to users' emotions and context through a new native audio model. This upgrade allows the AI to detect nuances in tone of voice, such as excitement or concern, and adjust its responses accordingly [1][2].

Key improvements include:

  • Enhanced use of intonation, rhythm, and pitch in speech [2].
  • Ability to match the user's mood for more engaging conversations [1].
  • Contextual voice adjustments, such as using a calmer tone for stressful topics or speeding up when the user is in a hurry [3].

Expanded App Integrations

Google is deepening Gemini Live's integration with its app ecosystem, allowing for more seamless multitasking and task management [2][4]. New integrations include:

  • Calendar, Keep, and Tasks: Users can add notes or schedule meetings through voice commands [4].
  • Messages, Phone, and Clock: Upcoming additions for communication management [2][4].
  • Enhanced Google Maps integration: Additional capabilities to be added soon [2].

These integrations aim to create a more ambient and always-on AI platform, reducing the need to switch between apps for various tasks [3][4].

Availability and Future Plans

The visual guidance feature is initially exclusive to the Pixel 10 series but will roll out to other Android devices and iOS in the near future [2][3]. Google is also offering one year of Google AI Pro access with the purchase of Pixel 10 Pro models, providing broader access to Gemini app features like video generation [2].

For wearable users, the Pixel Watch 4 introduces a raise-to-talk feature for Gemini, allowing for even more convenient access to the AI assistant [2].

As these updates roll out, Google continues to position Gemini as a universal AI assistant, capable of understanding the world around users and assisting with a wide range of tasks. The company's focus on natural interactions and seamless integration with daily life underscores its commitment to making AI technology more accessible and useful for everyday users [5].

Source: NDTV Gadgets 360

TheOutpost.ai

© 2025 Triveous Technologies Private Limited