Google's AI Mode Search Evolves: Now Answering Questions About Images

Google Enhances AI Mode with Multimodal Capabilities

Google has added multimodal functionality to AI Mode, its AI-powered search feature. The update lets users ask complex questions about images by combining Google Lens with a custom version of the Gemini large language model (LLM) 1

How It Works

The new multimodal AI Mode enables users to upload images or take photos directly through the search interface. Google Lens identifies specific objects within the images, while the Gemini model interprets the overall content and context. This combination allows AI Mode to understand the entire scene, including how objects relate to each other, their materials, colors, shapes, and arrangements 3

Google employs a "query fan-out" technique, where the system asks multiple sub-queries about both the image and the objects within it. This approach provides more detailed and contextually relevant information than traditional search methods 2
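To make the fan-out idea concrete, here is a minimal sketch under stated assumptions. It is not Google's implementation, which the sources do not describe in detail. The names `DetectedObject`, `fan_out`, `stub_search`, and `answer` are hypothetical, and the search backend is a placeholder that only echoes the query. The sketch shows one user question being expanded into a scene-level sub-query plus one sub-query per recognized object, with the sub-queries run concurrently and the results collected.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """Hypothetical stand-in for one object recognized in the image."""
    label: str


def fan_out(question: str, scene: str, objects: list[DetectedObject]) -> list[str]:
    """Expand one question into a scene-level query plus one query per object."""
    queries = [f"{question} (scene: {scene})"]
    queries += [f"{question} (object: {obj.label})" for obj in objects]
    return queries


def stub_search(query: str) -> str:
    """Placeholder backend (assumption); a real system would query an index."""
    return f"results for: {query}"


def answer(question: str, scene: str, objects: list[DetectedObject],
           search=stub_search) -> dict[str, str]:
    """Run every sub-query concurrently and map each query to its result."""
    queries = fan_out(question, scene, objects)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(search, queries))  # map() preserves input order
    return dict(zip(queries, results))


if __name__ == "__main__":
    books = [DetectedObject("Dune"), DetectedObject("Neuromancer")]
    for query, result in answer("similar highly rated books", "bookshelf", books).items():
        print(query, "->", result)
```

In this sketch the merge step is just a dictionary of per-query results. A production system would presumably rank and synthesize those results into a single answer, a step the article does not detail.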

User Experience and Functionality

Users can now engage in more complex interactions with AI Mode. For example, a user could take a photo of their bookshelf and ask, "If I enjoyed these, what are some similar books that are highly rated?" The system would then identify each book, provide recommendations, and even allow for follow-up questions to refine the search results 2

Expansion of Access

AI Mode initially launched exclusively for Google One AI Premium subscribers. Google is now expanding access to "millions more Labs users in the US" who aren't paying for AI features 1. This move suggests that Google is confident in the feature's performance and is preparing for a broader rollout.

Implications for Search and Competition

Google's AI Mode represents a significant shift in how users interact with search engines. The company reports that early telemetry shows users inputting about twice as much text in their searches compared to traditional web search, indicating a more detailed and conversational approach 1

This development positions Google to compete more effectively with emerging AI-powered search alternatives like Perplexity and OpenAI's ChatGPT Search 3. By combining real-time web-based answers with a conversational AI interface, Google aims to offer both the accuracy of a search engine and the interactivity of a chatbot 5

Future Outlook

As Google continues to refine and expand AI Mode, it's clear that the company sees this as a key strategy to maintain its dominant position in the search market. The integration of multimodal capabilities and the expansion of access suggest that AI-powered, conversational search experiences may become the new norm for internet users in the near future 1