Study Reveals GPT Models Struggle with Flexible Reasoning, Highlighting Limitations in AI Cognition

2 Sources

A new study from the University of Amsterdam and Santa Fe Institute shows that while GPT models perform well on standard analogy tasks, they struggle with variations, indicating limitations in AI's reasoning capabilities compared to humans.

GPT Models Struggle with Flexible Reasoning

A study by researchers from the University of Amsterdam and the Santa Fe Institute highlights the limits of artificial intelligence (AI) in replicating human-like reasoning. The research, published in Transactions on Machine Learning Research, compared the performance of GPT models with that of humans on analogical reasoning tasks [1][2].

Understanding Analogical Reasoning

Analogical reasoning, a fundamental aspect of human cognition, involves comparing two different things based on the relations they share. This ability is crucial for understanding the world and making decisions. For instance, recognizing that "cup is to coffee as bowl is to soup" demonstrates this type of reasoning [1][2].

Study Methodology and Findings

The study, led by Martha Lewis from the Institute for Logic, Language and Computation at the University of Amsterdam and Melanie Mitchell from the Santa Fe Institute, examined the performance of GPT models and humans on three types of analogy problems. Importantly, the researchers also tested how well both groups handled subtle modifications to these problems [1][2].

GPT Models' Performance on Standard vs. Modified Tasks

While GPT models showed impressive capabilities in solving standard analogy problems, they struggled significantly when faced with variations of these tasks. This contrast was particularly evident in several areas:

  1. Digit Matrices: GPT models' performance dropped noticeably when the position of the missing number was altered, whereas humans had no such difficulty [1][2].

  2. Story Analogies: GPT-4 showed a bias towards selecting the first given answer as correct, a tendency not observed in human participants. The AI also had more trouble than humans when key story elements were reworded [1][2].

  3. Simple Analogy Tasks: On simpler tasks, GPT models' performance declined with modifications, while humans maintained consistent results [1][2].
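
To make the kinds of modification listed above concrete, here is a minimal Python sketch. It is not the study's materials or code: the toy 3x3 matrix, its row rule, the choice of the last cell as the "standard" blank, and the helper names (make_digit_matrix, blank_cell, render, swap_answer_order) are all assumptions made for illustration. It shows two perturbations: moving the blank cell in a digit matrix, and presenting the same two answer options in both orders so that a preference for the first-listed answer could be detected.

```python
def make_digit_matrix(row_starts=(1, 4, 7), step=1):
    """Toy 3x3 matrix: each row counts up by `step` from its start value."""
    return [[start + step * col for col in range(3)] for start in row_starts]


def blank_cell(matrix, row, col):
    """Return (puzzle, answer): a copy of the matrix with one cell hidden."""
    puzzle = [list(r) for r in matrix]
    answer = puzzle[row][col]
    puzzle[row][col] = None
    return puzzle, answer


def render(puzzle):
    """Format the puzzle as text, showing the hidden cell as '?'."""
    return "\n".join(
        " ".join("?" if value is None else str(value) for value in row)
        for row in puzzle
    )


def swap_answer_order(option_a, option_b):
    """Both presentation orders of a two-option question (hypothetical helper)."""
    return [(option_a, option_b), (option_b, option_a)]


matrix = make_digit_matrix()

# Assumed "standard" form: the blank sits in the bottom-right cell.
standard, standard_answer = blank_cell(matrix, 2, 2)

# Variant: same matrix and rule, but the blank is moved to another cell.
variant, variant_answer = blank_cell(matrix, 0, 1)

print(render(standard))
print(render(variant))
```

A solver that has grasped the underlying rule should answer both versions equally well, and should pick the same option regardless of which order swap_answer_order presents it in.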

Implications for AI Understanding and Generalization

The research challenges the assumption that AI models like GPT-4 can reason in ways comparable to human cognition. Lewis explains, "This suggests that AI models often reason less flexibly than humans and their reasoning is less about true abstract understanding and more about pattern matching" [1][2].

Critical Considerations for AI Application

These findings raise important considerations for the deployment of AI in critical decision-making domains such as education, law, and healthcare. While AI remains a powerful tool, the study emphasizes that it is not yet a suitable replacement for human reasoning and thinking [1][2].

Future of AI and Human Cognition

The research underscores the need for continued development in AI to achieve more robust and flexible reasoning capabilities. As AI increasingly integrates into various aspects of society, understanding its limitations and strengths becomes crucial for responsible implementation and development [1][2].
