Google DeepMind unveils Gemini Robotics-ER 1.6 as robot dogs gain precision instrument reading

Reviewed by Nidhi Govil


Google DeepMind launched Gemini Robotics-ER 1.6, a foundation AI model for robotics that enables robot dogs like Boston Dynamics' Spot to read analog instruments with 98% accuracy. The model introduces agentic vision combining visual reasoning with code execution, allowing robots to interpret complex gauges, thermometers, and pressure readings in industrial environments. This marks a significant leap from the 23% accuracy of its predecessor.

Google DeepMind Launches Foundation AI Model for Robotics

Google DeepMind announced Gemini Robotics-ER 1.6 on April 14, positioning it as a high-level reasoning model designed to enhance robotics capabilities in physical environments [1]. The foundation AI model for robotics represents a substantial upgrade in how robots interact with real-world settings, particularly in industrial environments where precision matters. Developed through ongoing collaboration with Boston Dynamics, the model enables robot dogs like Spot to perform complex visual reasoning tasks that were previously challenging for automated systems [2].

Source: Ars Technica

The model provides native tools for task planning, including integration with Google Search to find information, vision-language-action models, and third-party user-defined functions to extend capabilities [2]. This architecture allows robots to plan and execute tasks with greater autonomy across factories, warehouses, and other demanding settings.

Dramatic Accuracy Improvements in Reading Analog Instruments

The most striking advancement in Gemini Robotics-ER 1.6 centers on its ability to read analog instruments with unprecedented accuracy. The model achieves 98% accuracy on instrument reading tasks, a massive jump from the 23% accuracy delivered by the older Gemini Robotics-ER 1.5 model [1].

Source: SiliconANGLE

For context, even Gemini 3.0 Flash managed only 67% accuracy on similar tasks.

This capability proves critical for robots performing visual inspections in industrial settings. Boston Dynamics' Spot, currently being trialed as a robotic inspector throughout industrial facilities including Hyundai Motor Group's automotive factories, must interpret multiple needles, liquid levels, container boundaries, tick marks, and text across various instruments [1]. The model now enables these robot dogs to accurately read analog thermometers and pressure gauges while roaming autonomously through complex environments.

Agentic Vision Powers Embodied Reasoning Capabilities

The breakthrough in instrument reading comes from what Google DeepMind calls "agentic vision," a capability first introduced in Gemini 3.0 Flash in January 2026 [1]. This approach combines visual reasoning with code execution to create a "visual scratchpad" for inspecting and manipulating images. The model takes a snapshot, resolves fine details, uses curated code to estimate proportions and intervals for accurate readings, then applies its reasoning engine to interpret the data [2].
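To make the "curated code" step concrete, here is a minimal sketch of the kind of calculation such a model might execute on its visual scratchpad: converting a detected needle angle into a gauge value by interpolating between the scale's endpoints. The function name, parameters, and example numbers are illustrative assumptions, not DeepMind's actual tooling.

```python
# Illustrative sketch (assumed, not DeepMind's code): turn a detected
# needle angle into a gauge reading by linear interpolation along the scale.

def read_gauge(needle_deg: float,
               min_deg: float, max_deg: float,
               min_value: float, max_value: float) -> float:
    """Interpolate a gauge value from the needle's angle.

    min_deg/max_deg: angles (degrees) of the scale's start and end marks.
    min_value/max_value: the values printed at those marks.
    """
    span_deg = max_deg - min_deg
    fraction = (needle_deg - min_deg) / span_deg  # 0.0 at min, 1.0 at max
    fraction = max(0.0, min(1.0, fraction))       # clamp to the scale
    return min_value + fraction * (max_value - min_value)

# Example: a pressure gauge whose scale sweeps from 225° (0 bar)
# down to -45° (10 bar), with the needle detected at 90°.
print(read_gauge(90.0, 225.0, -45.0, 0.0, 10.0))  # → 5.0
```

The hard perception work (finding the needle, the tick marks, and the printed values) would be done by the model's visual reasoning; the arithmetic above is the deterministic piece that code execution makes reliable.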

Even without agentic vision, the baseline Gemini Robotics-ER 1.6 model achieves 86% accuracy in reading instruments by pointing to individual elements within an image to work through complex tasks [1]. This embodied reasoning capability extends to counting items and identifying salient features in cluttered environments.

Enhanced Spatial Reasoning and Precision Object Detection

Beyond gauge reading, the model demonstrates significant improvements in spatial reasoning and precision object detection. In one test, Gemini Robotics-ER 1.6 correctly identified the number of hammers, scissors, paintbrushes, pliers, and gardening tools in a cluttered image [1]. The older model failed to accurately count hammers or paintbrushes, completely missed scissors, and hallucinated a nonexistent wheelbarrow. This reduction in hallucination suggests the newer model moves closer to reliable performance, though it remains far from human-level comprehension.

The model also excels at relational logic, such as identifying the smallest object in a set or defining from-to relationships when moving objects [2]. Enhanced trajectory mapping helps robots determine optimal paths for grabbing objects, while improved multi-view reasoning allows robotic systems to use multiple camera streams for better environmental understanding [1].

Safer Decision-Making for Physical AI Systems

Google DeepMind describes Gemini Robotics-ER 1.6 as its "safest robotics model yet," with substantially improved capacity to adhere to physical safety constraints [1]. The model enables robots to follow safety instructions and make safer decisions when handling liquids or materials. It can more accurately perceive injury risk to humans in different scenarios, such as recognizing when a young child approaches an electrical socket.

Marco da Silva, vice president and general manager of Spot at Boston Dynamics, emphasized that "capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand, and react to real-world challenges completely autonomously" [2]. This autonomy carries both promise and risk as robots transition from highly specialized factory assembly line roles to free-range workers in less controlled environments.

What This Means for Industrial Robotics

The practical value of AI models like Gemini Robotics-ER 1.6 will emerge as robotics companies gain hands-on testing experience. Historically, robots have proven most efficient performing repetitive tasks in controlled settings. Google bets that advanced models can help robots operate effectively in complex real-world environments, though this also increases the risk of damage or harm if systems fail [1].

For industries relying on regular inspections across large facilities, the ability of robot dogs to autonomously read gauges and thermometers could reduce labor costs and improve monitoring consistency. The model's improvements in understanding constraints—such as reasoning through prompts like "point to every object small enough to fit inside the blue cup"—suggest potential applications beyond industrial inspection, including warehouse sorting and domestic assistance [2]. Watch for deployment announcements from Boston Dynamics and other robotics firms as they integrate this technology into commercial products.
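The "small enough to fit inside the blue cup" prompt can be sketched as a simple size filter over detected objects. Everything here, including the detection format, the fit test, and the example measurements, is an assumption made for illustration; the actual model reasons over raw pixels rather than explicit bounding-box records.

```python
# Hypothetical sketch of size-constraint reasoning: keep only objects
# whose estimated dimensions fit inside a reference container.
from typing import NamedTuple

class Detection(NamedTuple):
    label: str
    width_cm: float
    height_cm: float

def fits_inside(obj: Detection, container: Detection) -> bool:
    """Crude fit test: the object must be smaller than the container
    in both dimensions, allowing for rotation (sorted sides)."""
    w, h = sorted((obj.width_cm, obj.height_cm))
    cw, ch = sorted((container.width_cm, container.height_cm))
    return w < cw and h < ch

blue_cup = Detection("blue cup", 8.0, 10.0)
scene = [
    Detection("die", 1.6, 1.6),
    Detection("pen cap", 1.2, 4.5),
    Detection("hammer", 12.0, 30.0),
]
print([d.label for d in scene if fits_inside(d, blue_cup)])
# → ['die', 'pen cap']
```

A real system would also need depth estimates and occlusion handling, but the filter captures the relational step the prompt asks for: compare every candidate against one reference object.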


TheOutpost.ai


© 2026 Triveous Technologies Private Limited