Google's Gemini gains screen automation to control Android apps and perform tasks for you

Reviewed by Nidhi Govil


Google is developing screen automation capabilities for Gemini that would let the AI agent control other apps on Android devices. Codenamed bonobo, the feature will initially handle tasks like booking rides and placing orders through Uber and food delivery apps. Users retain full control with manual override options, but the feature raises privacy concerns, as screenshots of automated sessions may be reviewed by humans.

Gemini Evolves Into an AI Agent With Screen Automation

Google is building a major upgrade for Gemini that transforms the AI assistant from a conversational helper into an autonomous work agent capable of performing tasks on behalf of users. A teardown of the Google app beta (version 17.4) by 9to5Google reveals a feature called "screen automation" that would let Gemini control other apps directly on Android devices [2]. Codenamed bonobo internally, this capability represents a significant leap in AI agent capabilities, allowing Gemini to interact with Android app interfaces by tapping buttons and navigating screens just as a human would [4].

Source: Tom's Guide

The functionality builds on Project Astra, which Google demonstrated at I/O 2025, showcasing Gemini's ability to view text and media on phones while scrolling and tapping when needed [1]. This upgrade shifts Gemini from simply suggesting actions to actually executing them inside apps, without requiring users to manually tap through screens.

Source: Android Authority

Booking Rides and Placing Orders Through Voice Commands

Initially, screen automation will focus on practical everyday tasks like booking rides and placing orders through certain apps. Users could ask Gemini to book an Uber to the office or order dinner from Uber Eats without ever opening the apps themselves [2]. Early support will likely include ride-hailing services like Uber and Lyft, along with food delivery platforms such as DoorDash and Uber Eats [5].

Google plans to launch screen automation as a Labs feature, initially limiting Gemini's actions to a small set of Android apps. This cautious approach makes sense given that app UIs change frequently, and Google needs to ensure reliable performance before expanding compatibility. Beyond ride-hailing and food delivery, early support could extend to first-party Google apps, where the company has more direct control over interface stability [2].

Android 16 QPR3 Lays Technical Groundwork

The screen automation feature requires at least Android 16 QPR3 to function, as Google has laid the necessary technical groundwork in this quarterly platform release [2]. This requirement means the feature won't arrive immediately but will likely launch alongside Android 16 QPR3 in March, according to reports. The timing suggests Google is taking a measured approach to ensure the underlying infrastructure supports reliable agentic functionality.

Privacy Concerns and User Supervision Requirements

While the autonomous work agent capabilities sound convenient, they raise significant privacy concerns. Code strings discovered in the beta reveal that when Gemini interacts with an app, screenshots are reviewed by trained reviewers and used to improve Google services if Keep Activity is enabled [2]. These human reviewers will analyze how Gemini performs tasks, creating a feedback loop for improvement but also introducing potential privacy implications.

Google includes several warnings in the feature documentation. Users are explicitly advised not to enter payment information into Gemini chats and to avoid using screen automation during emergencies or for tasks involving sensitive information [2]. The code also warns that "Gemini can make mistakes" and emphasizes that users remain responsible for what the AI does on their behalf, requiring close supervision [3].

Crucially, users maintain control throughout the process. Google has designed the system to allow manual intervention at any time, letting users stop Gemini or take over manually whenever they choose [2]. This safeguard addresses concerns about surrendering too much autonomy to AI assistants.

Implications for Mobile AI and User Adoption

If screen automation rolls out widely, it could fundamentally change how people interact with mobile devices, shifting from direct tapping and swiping to delegating tasks to AI agents [4]. The feature may initially be limited to users on Gemini Pro and Ultra tiers, following Google's pattern of reserving advanced AI features for premium subscribers [2].

However, adoption may face resistance from users skeptical about handing over control to machines. Some Android users prefer maintaining direct control over their devices and completing tasks themselves, viewing AI automation as an unnecessary layer that introduces potential errors and privacy risks [3]. The success of screen automation will depend on whether Google can demonstrate reliable performance while addressing legitimate concerns about security, oversight, and the implications of letting AI agents handle sensitive workflows like bookings or financial orders.
