Google's Gemini 3 Deep Think achieves gold medal standards in Math and Physics Olympiads

Reviewed by Nidhi Govil

Google has launched a major upgrade to Gemini 3 Deep Think, its specialized AI reasoning model designed for advanced math and science research. The model achieved gold-medal-level performance at the 2025 International Math Olympiad and on the written sections of the 2025 International Physics and Chemistry Olympiads, and scored 84.6% on the ARC-AGI-2 benchmark. Reported real-world applications include identifying a flaw in a peer-reviewed paper at Rutgers University and optimizing semiconductor fabrication at Duke University.

Google Unveils Major Upgrade to Specialized AI Reasoning Model

Google has released a significant update to Gemini 3 Deep Think, positioning the specialized AI reasoning model as a partner for tackling complex problems in scientific research and engineering. The upgrade marks a shift from general-purpose tools toward systems built specifically for the messy realities of advanced research, where problems often lack clear guardrails or a single correct solution [1][2]. Developed in close partnership with scientists and researchers, the model blends deep scientific knowledge with engineering utility to drive practical applications across chemistry, computer science, and physics [1].

Source: Digit

Gemini 3 Deep Think Achieves Gold Medal Standards in Advanced Math and Science

The updated model posts strong results on some of the world's most challenging academic benchmarks. Gemini 3 Deep Think achieved gold-medal-level performance at the 2025 International Math Olympiad, demonstrating a capacity for abstract logic and creative problem-solving at the highest competitive level. Its expertise extends beyond mathematics: it also delivered gold-medal-level results on the written sections of the 2025 International Physics and Chemistry Olympiads. These results suggest the model has moved beyond pattern matching toward first-principles reasoning, which could change how AI is applied to scientific challenges.

The model scored 84.6% on the ARC-AGI-2 benchmark, which measures fluid intelligence and the ability to learn new concepts on the fly. In competitive programming, it attained an Elo rating of 3455 on Codeforces, placing it among the elite tier of human coders. It also scored 48.4% on "Humanity's Last Exam," a benchmark of questions designed by experts to be nearly impossible for contemporary AI to solve without specialized tools.
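For context on the Codeforces figure, here is a minimal sketch, assuming the standard Elo expected-score formula. Codeforces uses its own Elo-like rating system, so treat this as an approximation. The 3000-rated opponent is a hypothetical value chosen purely for illustration; only the 3455 rating comes from the reporting above.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard Elo model.

    The result lies in (0, 1): 0.5 means evenly matched, values near 1
    mean A is heavily favored.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


MODEL_RATING = 3455            # rating reported in the article
HYPOTHETICAL_OPPONENT = 3000   # assumed value, for illustration only

if __name__ == "__main__":
    score = elo_expected_score(MODEL_RATING, HYPOTHETICAL_OPPONENT)
    print(f"Expected score vs. a {HYPOTHETICAL_OPPONENT}-rated opponent: {score:.2f}")
```

Under these assumptions, the expected score against the hypothetical opponent is about 0.93 per head-to-head contest. A 455-point gap is large on the Elo scale, which is why the article places the rating in the top tier.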

Source: 9to5Google

Real-World Applications Transform Scientific Research and Engineering

The reasoning capabilities of Gemini 3 Deep Think are already delivering results in research settings where human peer review reaches its limits. At Rutgers University, researchers used the model to review a highly technical mathematics paper on the intersection of Einstein's theory of gravity and quantum mechanics. In a field where training data is scarce and the logic is dense, the model identified a subtle logical flaw that had gone unnoticed during traditional human peer review. This ability to act as a high-level auditor could change how academic research is verified and published.

At Duke University's Wang Lab, researchers used the model to optimize fabrication methods for semiconductor materials, designing a precise recipe for growing thin films larger than 100 micrometers, a target that had previously eluded researchers using standard methods. The model allows researchers to interpret complex data and engineers to model physical systems through code [2]. One practical feature lets the model analyze a simple hand-drawn sketch and turn it into a 3D-printing-ready file, streamlining prototyping from basic concept to physical part [2].

Aletheia Agent Brings Autonomous Research Capabilities

Google has built a math research agent called Aletheia that can conduct autonomous research or collaborate with humans on advanced math and science challenges [1]. The agent can "admit failure to solve a problem," which improves efficiency for researchers by avoiding time wasted on unworkable approaches [1]. Google has published papers resulting from this technology in fields ranging from information and complexity theory to cryptography and mechanism design [1]. The agent uses Google Search to avoid inaccuracies and incorrect citations when conducting research [1].

Access Through Google AI Ultra and Gemini API

The updated Gemini 3 Deep Think is now available in the Gemini app for Google AI Ultra subscribers [1][2]. Google is also making it available through the Gemini API, with an early access program that lets enterprises, researchers, and independent engineers integrate its deep reasoning capabilities into custom applications [2]. The release is part of a broader push by leading AI developers to build more advanced tools that can handle everything from complex coding to scientific research, with competitors such as Anthropic recently releasing models for financial research and legal services [1]. The next frontier is not just faster answers but deeper, verified logic applied to the hardest scientific problems.

Source: Bloomberg

TheOutpost.ai

© 2026 Triveous Technologies Private Limited