Apple Study Challenges AI Reasoning Capabilities, Casting Doubt on AGI Claims

Reviewed by Nidhi Govil

22 Sources

Apple researchers find that advanced AI reasoning models struggle with complex problem-solving, suggesting fundamental limitations in their ability to generalize reasoning like humans do.

Apple Researchers Challenge AI Reasoning Capabilities

A new study from Apple researchers has cast doubt on the capabilities of advanced AI reasoning models, challenging claims about imminent artificial general intelligence (AGI). The research, titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity," was conducted by a team led by Parshin Shojaee and Iman Mirzadeh 1.

Source: NDTV Gadgets 360

Study Methodology and Findings

The researchers examined "large reasoning models" (LRMs), including OpenAI's o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking. These models attempt to simulate logical reasoning through a process called "chain-of-thought reasoning" 1. The study used four classic puzzles - Tower of Hanoi, checkers jumping, river crossing, and blocks world - scaled from easy to extremely complex 1.
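To give a sense of how quickly such puzzles scale, here is a minimal Python sketch of the standard recursive Tower of Hanoi solution, with a checker that replays a move list. It is an illustration only, not the study's evaluation code; the function names and the peg labels "A", "B", and "C" are hypothetical choices made for this example. The optimal solution for n disks takes 2^n - 1 moves, so each added disk roughly doubles the length of the answer a model must write out.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the optimal move list for an n-disk Tower of Hanoi."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, then restack.
    return (
        hanoi_moves(n - 1, source, spare, target)
        + [(source, target)]
        + hanoi_moves(n - 1, spare, target, source)
    )


def is_valid_solution(n, moves):
    """Replay the moves; every step must be legal and end with all disks on C."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if not pegs[src]:
            return False  # nothing to pick up from the source peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # a larger disk cannot sit on a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["C"] == list(range(n, 0, -1))


for n in (3, 7, 10):
    moves = hanoi_moves(n)
    # Prints the disk count, the number of moves (2**n - 1), and the check result.
    print(n, len(moves), is_valid_solution(n, moves))
```

With 3 disks the solution is 7 moves; with 10 disks it is 1,023. A single illegal move anywhere in the list makes the whole answer fail the check.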

Source: Mashable

Key findings include:

  1. On simple tasks, standard models outperformed reasoning models.
  2. For moderately difficult tasks, reasoning models had an advantage.
  3. On highly complex tasks, both types of models failed completely 1, 3.

The researchers also observed a "counterintuitive scaling limit" where reasoning models initially generated more thinking tokens as problem complexity increased, but then reduced their reasoning effort beyond a certain threshold 1.

Implications for AI Development

These results align with a recent study that tested the same models on problems from the United States of America Mathematical Olympiad (USAMO) and found that they achieved low scores on novel mathematical proofs 1. Both studies documented severe performance degradation on problems requiring extended systematic reasoning.

AI researcher Gary Marcus, a longtime skeptic of large language models, called the Apple results "pretty devastating to LLMs" 1. The study provides empirical support for the argument that neural networks struggle with out-of-distribution generalization.

Competing Interpretations

Not all researchers agree with the interpretation that these results demonstrate fundamental reasoning limitations. Some argue that the observed limitations may reflect deliberate training constraints rather than inherent inabilities 1.

University of Toronto economist Kevin A. Bryan suggested that models are specifically trained through reinforcement learning to avoid excessive computation, which could explain the observed behavior 1. Software engineer Sean Goedecke offered a similar critique, noting that when faced with extremely complex tasks, models like DeepSeek-R1 may decide that generating all moves manually is impossible and attempt to find shortcuts 1.

Broader Context and Industry Claims

The study's findings contrast sharply with recent claims by AI industry leaders. Sam Altman of OpenAI and Demis Hassabis of Google DeepMind have made bold predictions about AI capabilities in the 2030s, including solving high-energy physics problems and enabling space colonization 2.

Source: The Register

However, researchers working with today's most advanced AI systems report a different reality: even the best models fail to solve basic puzzles that most humans find trivial, and the promise of AI that can "reason" appears to be overblown 2, 4.

Limitations and Future Directions

The Apple researchers acknowledge that their study represents only a "narrow slice" of potential reasoning tasks 5. However, their findings suggest that current approaches to AI development may be encountering fundamental barriers to generalizable reasoning 4.

As the AI industry continues to invest heavily in developing more advanced models, with reports of Meta planning a $15 billion investment to achieve "superintelligence" 2, these research findings highlight the need for a critical examination of AI capabilities and limitations. The gap between industry claims and research findings underscores the importance of continued rigorous testing and evaluation of AI systems as they evolve.
