AI Models Struggle with Abstract Visual Reasoning, Falling Short of Human Capabilities

Curated by THEOUTPOST

On Fri, 11 Oct, 12:02 AM UTC

4 Sources

A study by USC researchers reveals that AI models, particularly open-source ones, struggle with abstract visual reasoning tasks like those found in human IQ tests. While closed-source models such as GPT-4V perform better, they still fall short of human cognitive abilities.

AI Models Face Challenges in Abstract Visual Reasoning

Researchers from the USC Viterbi School of Engineering's Information Sciences Institute (ISI) have conducted a groundbreaking study to assess the capabilities of artificial intelligence in solving abstract visual puzzles similar to those found in human IQ tests. The study, presented at the Conference on Language Modeling (COLM 2024) in Philadelphia, reveals significant limitations in AI's ability to perform nonverbal abstract reasoning tasks [1].

Study Methodology and Findings

The research team, led by Kian Ahrabian and Zhivar Sourati, tested 24 different multi-modal large language models (MLLMs) using puzzles based on Raven's Progressive Matrices, a standard test of abstract reasoning. The results showed a stark contrast between open-source and closed-source AI models [2].
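
The paper's actual evaluation harness is not reproduced here, but the setup can be sketched. The snippet below is a minimal, hypothetical version: it sends each puzzle image to a vision-capable model and scores the numeric reply against an answer key. The file layout (`puzzles/`, `answers.json`), the prompt wording, and the `gpt-4o` model name are illustrative stand-ins, not details from the study.

```python
# A minimal, hypothetical sketch of the evaluation loop (not the authors'
# code): send each puzzle image to a vision-capable model and score the
# numeric reply against an answer key.
import base64
import json
from pathlib import Path

from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

client = OpenAI()

PROMPT = (
    "The image shows a 3x3 matrix of shapes with the bottom-right cell "
    "missing, followed by eight candidate answers labeled 1-8. Which "
    "candidate completes the pattern? Reply with the number only."
)

def ask_model(image_path: Path) -> str:
    """Send one puzzle image to the model and return its stripped reply."""
    b64 = base64.b64encode(image_path.read_bytes()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",  # stand-in for a GPT-4V-class model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return (resp.choices[0].message.content or "").strip()

# answers.json maps puzzle filenames to the correct candidate number
# (hypothetical file layout).
answers = json.loads(Path("answers.json").read_text())
correct = sum(
    ask_model(Path("puzzles") / name) == str(label)
    for name, label in answers.items()
)
print(f"accuracy: {correct / len(answers):.1%}")
```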

Open-source models performed poorly, with Ahrabian stating, "They were really bad. They couldn't get anything out of it." In contrast, closed-source models like GPT-4V demonstrated better performance, though still far from matching human cognitive abilities [3].

Identifying AI's Stumbling Blocks

The researchers delved deeper to understand where the AI models were failing. They discovered that the issue was not limited to visual processing but extended to the reasoning process itself. Even when provided with detailed textual descriptions of the images, many models struggled to reason effectively [4].
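
One way to read that finding is as an ablation: give the model a written description of the puzzle instead of the image, so any remaining failure must lie in reasoning rather than perception. A minimal sketch of the idea follows; the description format, prompt wording, and model name are invented for illustration and are not the study's materials.

```python
from openai import OpenAI  # pip install openai; needs OPENAI_API_KEY set

client = OpenAI()

# A hand-written stand-in for a detailed image description (hypothetical;
# the study's actual descriptions will differ).
description = (
    "Row 1: one circle, two circles, three circles. "
    "Row 2: one square, two squares, three squares. "
    "Row 3: one triangle, two triangles, ?. "
    "Candidates: 1) one triangle, 2) three triangles, 3) two squares."
)

resp = client.chat.completions.create(
    model="gpt-4o",  # stand-in model name
    messages=[{
        "role": "user",
        "content": description
        + " Which candidate completes the pattern? Reply with the number only.",
    }],
)

# If the model still fails with vision taken out of the loop, the
# weakness is in reasoning, not perception.
print(resp.choices[0].message.content)
```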

Improving AI Performance

To enhance AI performance, the team explored a technique called "Chain of Thought prompting." This method, which guides the AI through its reasoning step by step, led to significant improvements in some cases. Ahrabian noted, "By guiding the models with hints, we were able to see up to 100% improvement in performance" [2].
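
Chain of Thought prompting swaps a direct question for one that asks the model to spell out intermediate steps before committing to an answer. The sketch below contrasts the two prompt styles and parses the final answer out of a step-by-step reply; the wording is illustrative, not the hints the researchers actually used.

```python
import re

# A direct prompt asks only for the final answer.
DIRECT = "Which candidate completes the pattern? Reply with the number only."

# A Chain of Thought prompt asks for intermediate reasoning first
# (illustrative wording, not the study's actual hints).
COT = (
    "Let's think step by step. Describe the pattern in each row, state the "
    "rule relating the cells, apply the rule to the missing cell, and "
    "finish with a line of the form 'Answer: <number>'."
)

def extract_answer(reply: str) -> str | None:
    """Pull the final numbered answer out of a step-by-step reply."""
    match = re.search(r"Answer:\s*(\d+)", reply)
    return match.group(1) if match else None

print(extract_answer("Each row adds one shape. Answer: 2"))  # -> "2"
```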

Implications and Future Directions

Jay Pujara, a research associate professor and author of the study, emphasized the importance of understanding AI's limitations: "We still have such a limited understanding of what new AI models can do, and until we understand these limitations, we can't make AI better, safer, and more useful" [1].

The study's findings highlight both the current limitations of AI and the potential for future advancements. As AI models continue to evolve, this research could pave the way for developing AI systems that can not only understand but also reason in ways more comparable to human cognition [4].

Continue Reading

Apple Study Reveals Limitations in AI's Mathematical Reasoning Abilities

A recent study by Apple researchers exposes significant flaws in the mathematical reasoning capabilities of large language models (LLMs), challenging the notion of AI's advanced reasoning skills and raising questions about their real-world applications.

17 Sources, including PYMNTS.com, Wired, Futurism, and TechRadar

Study Reveals GPT Models Struggle with Flexible Reasoning, Highlighting Limitations in AI Cognition

A new study from the University of Amsterdam and Santa Fe Institute shows that while GPT models perform well on standard analogy tasks, they struggle with variations, indicating limitations in AI's reasoning capabilities compared to humans.

2 Sources: ScienceDaily and Tech Xplore

FrontierMath: New AI Benchmark Exposes Limitations in Advanced Mathematical Reasoning

Epoch AI's FrontierMath, a new mathematics benchmark, reveals that leading AI models struggle with complex mathematical problems, solving less than 2% of the challenges.

8 Sources, including pcgamer, Ars Technica, Phys.org, and VentureBeat

New AI Benchmark 'Humanity's Last Exam' Stumps Top Models, Revealing Limits of Current AI

Scale AI and the Center for AI Safety have introduced a challenging new AI benchmark called 'Humanity's Last Exam', which has proven difficult for even the most advanced AI models, highlighting the current limitations of artificial intelligence.

7 Sources, including ZDNet, Quartz, TechRadar, and Analytics India Magazine

Apple Research Exposes Fundamental Flaws in AI's Logical Reasoning Capabilities

Apple researchers conducted tests revealing significant limitations in AI models' ability to perform simple arithmetic and logical reasoning, raising questions about the true intelligence of current AI systems.

2 Sources: 9to5Mac and Miami Herald
