Gemini 3 Flash's Agentic Vision uses code execution to inspect images step by step
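Google DeepMind has introduced Agentic Vision in Gemini 3 Flash, turning image analysis from a single static pass into an active, multi-step investigation. The model writes and runs Python code to zoom into, annotate, and otherwise manipulate images in a Think, Act, Observe loop: it plans what to examine, executes code against the image, and inspects the result before deciding on the next step or the final answer. Google reports a 5-10% quality boost across vision benchmarks when code execution is enabled. Agentic Vision is now available through the Gemini API in Google AI Studio and Vertex AI.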
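
The Think, Act, Observe loop described above can be illustrated with a toy sketch. This is not Gemini's implementation and it calls no Gemini API. The image is a small grid of integers, and the names Context, think, act, observe, and investigate are hypothetical, chosen only for this example. The "think" step here is a simple greedy heuristic (zoom into the brightest quadrant), standing in for the model's actual reasoning.

```python
from __future__ import annotations

from dataclasses import dataclass, field

Image = list[list[int]]  # toy grayscale image: rows of pixel intensities
Box = tuple[int, int, int, int]  # (top, left, height, width) within the current view


@dataclass
class Context:
    """Hypothetical running context: the current view plus a log of steps."""
    view: Image
    origin: tuple[int, int] = (0, 0)  # (row, col) of the view in the original image
    log: list[str] = field(default_factory=list)


def region_sum(view: Image, box: Box) -> int:
    """Total intensity inside a box of the view."""
    top, left, height, width = box
    return sum(sum(row[left:left + width]) for row in view[top:top + height])


def think(ctx: Context) -> Box | None:
    """Plan the next zoom: pick the brightest quadrant of the current view.

    Returns None once the view is a single pixel and no zoom is possible.
    """
    h, w = len(ctx.view), len(ctx.view[0])
    if h == 1 and w == 1:
        return None
    mid_r, mid_c = max(h // 2, 1), max(w // 2, 1)
    boxes = [
        (r0, c0, r1 - r0, c1 - c0)
        for r0, r1 in ((0, mid_r), (mid_r, h))
        for c0, c1 in ((0, mid_c), (mid_c, w))
        if r1 > r0 and c1 > c0  # skip empty quadrants of 1-pixel-wide views
    ]
    return max(boxes, key=lambda box: region_sum(ctx.view, box))


def act(view: Image, box: Box) -> Image:
    """Execute the plan: crop the view to the chosen box."""
    top, left, height, width = box
    return [row[left:left + width] for row in view[top:top + height]]


def observe(ctx: Context, box: Box, crop: Image) -> None:
    """Feed the result back into the context for the next iteration."""
    ctx.origin = (ctx.origin[0] + box[0], ctx.origin[1] + box[1])
    ctx.view = crop
    ctx.log.append(f"zoomed to {box}, view is now {len(crop)}x{len(crop[0])}")


def investigate(image: Image, max_steps: int = 16) -> Context:
    """Run the loop until think() has nothing left to plan or steps run out."""
    ctx = Context(view=image)
    for _ in range(max_steps):
        box = think(ctx)
        if box is None:
            break
        crop = act(ctx.view, box)
        observe(ctx, box, crop)
    return ctx
```

Calling investigate on a 4x4 grid of zeros with one bright pixel at row 2, column 3 takes two zoom steps and ends with a one-pixel view whose origin is (2, 3). The point of the sketch is the control flow: each observation becomes the input to the next planning step, rather than the whole image being judged once.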