2 Sources
[1]
OpenAI's experimental model achieved gold at the International Math Olympiad
OpenAI has achieved "gold medal-level performance" at the International Math Olympiad, notching another important milestone for AI's fast-paced growth. Alexander Wei, a research scientist at OpenAI working on LLMs and reasoning, posted on X that an experimental research model delivered on this "longstanding grand challenge in AI." According to Wei, an unreleased model from OpenAI was able to solve five out of six problems at one of the world's longest-standing and prestigious math competitions, earning 35 out of 42 points total. The International Math Olympiad (IMO) sees countries send up to six students to solve extremely difficult algebra and pre-calculus problems. These exercises are seemingly simple but usually require some creativity to score the highest marks on each problem. For this year's competition, only 67 of the 630 total contestants received gold medals, or roughly 10 percent. AI is often tasked with tackling complex datasets and repetitive actions, but it usually falls short when it comes to solving problems that require more creativity or complex decision-making. However, with the latest IMO competition, OpenAI says its model was able to handle complicated math problems with human-like reasoning. "By doing so, we've obtained a model that can craft intricate, watertight arguments at the level of human mathematicians," Wei wrote on X. Wei and Sam Altman, CEO of OpenAI, both added that the company doesn't expect to release anything with this level of math capability for several months. That means the upcoming GPT-5 will likely be an improvement from its predecessor, but it won't feature that same impressive capability to compete in the IMO.
[2]
OpenAI's Reasoning Model Wins Gold at 2025 IMO, GPT-5 Coming Soon | AIM
An experimental large language model (LLM) developed by OpenAI has achieved gold medal-level performance at the 2025 International Math Olympiad (IMO), a milestone in AI reasoning capabilities. Announcing the result on X, OpenAI researcher Alexander Wei said, "Our latest experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world's most prestigious math competition -- the International Math Olympiad." The model was evaluated under the same conditions as human contestants, including two 4.5-hour sessions, no access to tools or internet, and writing detailed proofs based on official IMO problems. The AI successfully solved 5 out of 6 problems, earning 35 out of 42 possible points. Three former IMO medalists graded each solution independently, with final scores based on unanimous agreement. IMO problems are widely regarded as some of the most difficult in competitive mathematics, requiring extended periods of creative reasoning. Wei contextualised the achievement by noting the progression of reasoning benchmarks: "We've now progressed from GSM8K (~0.1 min for top humans) → MATH benchmark (~1 min) → AIME (~10 mins) → IMO (~100 mins)." He added that IMO problems "demand a new level of sustained creative thinking" and that the model's performance demonstrates progress in "general-purpose reinforcement learning and test-time compute scaling." The model is not being released to the public in the near term. "The IMO gold LLM is an experimental research model. We don't plan to release anything with this level of math capability for several months," Wei clarified. While OpenAI plans to release GPT-5 soon, the IMO-capable system is part of a separate research track. "We are releasing GPT-5 soon, and we're excited for you to try it," said Wei. Meanwhile, Yuchen Jin, co-founder of Hyperbolic Labs, also suggested on X that the launch of GPT-5 may be imminent. 
According to Jin, GPT-5 will not be a single model but a system of multiple specialised models, with a router that dynamically switches between models optimised for reasoning, non-reasoning, and tool use. He added that this architecture is likely why OpenAI CEO Sam Altman previously spoke about "fixing model naming," as users would no longer need to select a specific model, with prompts automatically routed to the most suitable one. Jin also noted that GPT-6 is already in training. "I just hope they're not delaying it for more safety tests," he wrote. Wei also acknowledged the broader implications. "This underscores how fast AI has advanced in recent years. In 2021, my PhD advisor, Jacob Steinhardt, had me forecast AI math progress by July 2025. I predicted 30% on the MATH benchmark... Instead, we have IMO gold." Wei credited Sheryl Hsu, Noam Brown, and others for their role in the research. Last year, Google DeepMind's AlphaProof and AlphaGeometry 2 solved four out of six problems from that year's International Mathematical Olympiad (IMO), achieving a score equivalent to a silver medalist in the competition.
OpenAI's latest experimental AI model has demonstrated gold medal-level performance at the 2025 International Math Olympiad, solving 5 out of 6 problems and scoring 35 out of 42 points. This achievement marks a significant milestone in AI's reasoning capabilities.
In a groundbreaking development for artificial intelligence, OpenAI has announced that its experimental model has achieved "gold medal-level performance" at the 2025 International Math Olympiad (IMO). This achievement marks a significant milestone in AI's ability to tackle complex mathematical problems requiring creative reasoning [1].
The unreleased AI model solved five out of six problems at the IMO, earning 35 out of 42 points. That score sits at gold-medal level: at this year's competition, only 67 of the 630 contestants, or roughly 10 percent, received gold medals [1].
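The figures above can be checked with a few lines of arithmetic. The sketch below assumes the standard IMO scoring of 7 points per problem (not stated in the sources) and that the five solved problems received full marks while the sixth received none; the function names are illustrative only.

```python
# Assumption: each of the six IMO problems is worth 7 points (standard IMO scoring).
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def imo_score(problems_solved: int) -> int:
    """Score when `problems_solved` problems earn full marks and the rest earn zero."""
    return problems_solved * POINTS_PER_PROBLEM


def gold_share(gold_medals: int, contestants: int) -> float:
    """Fraction of contestants who received a gold medal."""
    return gold_medals / contestants


max_score = imo_score(NUM_PROBLEMS)   # all six problems solved
model_score = imo_score(5)            # five of six problems solved
share = gold_share(67, 630)           # figures reported in source [1]

print(f"Model score: {model_score} / {max_score}")
print(f"Gold medal share: {share:.1%}")
```

Under these assumptions the script reproduces the reported 35 out of 42 points and a gold-medal share of about 10.6 percent.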
Alexander Wei, a research scientist at OpenAI, emphasized the significance of this accomplishment, stating that the model can now "craft intricate, watertight arguments at the level of human mathematicians" [1].
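The AI model was evaluated under the same conditions as human participants: two 4.5-hour sessions, no access to tools or the internet, and detailed written proofs based on the official IMO problems [2].
Three former IMO medalists independently graded each solution, with final scores based on unanimous agreement [2].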
This achievement represents a significant leap in AI's reasoning capabilities. Wei put the result in context by tracing how reasoning benchmarks have grown in the time they demand of top human solvers:
"We've now progressed from GSM8K (~0.1 min for top humans) → MATH benchmark (~1 min) → AIME (~10 mins) → IMO (~100 mins)" [2].
The success at the IMO demonstrates advancements in "general-purpose reinforcement learning and test-time compute scaling" [2](https://analyticsindiamag.com/ai-news-updates/openais-reasoning-model-wins-gold-at-2025-imo-gpt-5-coming-soon).
While this experimental model showcases impressive capabilities, OpenAI does not plan to release anything with this level of math capability for several months. GPT-5, which OpenAI says it will release soon, will likely improve on its predecessor but is not expected to match the mathematical ability of the gold-medal-level model [1][2].
This accomplishment surpasses previous AI performances in mathematical competitions. Last year, Google DeepMind's AlphaProof and AlphaGeometry 2 solved four out of six problems from that year's IMO, achieving a score equivalent to a silver medalist [2].
OpenAI's achievement at the IMO underscores the rapid advancement of AI in recent years, surpassing earlier predictions and setting new benchmarks for machine intelligence in complex problem-solving tasks.
Summarized by Navi