AI Models Struggle with Abstract Visual Reasoning, Falling Short of Human Capabilities

A study by USC researchers reveals that AI models, particularly open-source ones, struggle with abstract visual reasoning tasks similar to human IQ tests. While closed-source models like GPT-4V perform better, they still fall short of human cognitive abilities.

AI Models Face Challenges in Abstract Visual Reasoning

Researchers from the USC Viterbi School of Engineering's Information Sciences Institute (ISI) have conducted a study assessing how well artificial intelligence can solve abstract visual puzzles similar to those found in human IQ tests. The study, presented at the Conference on Language Modeling (COLM 2024) in Philadelphia, reveals significant limitations in AI's ability to perform nonverbal abstract reasoning tasks 1.

Study Methodology and Findings

The research team, led by Kian Ahrabian and Zhivar Sourati, tested 24 different multi-modal large language models (MLLMs) using puzzles based on Raven's Progressive Matrices, a standard test of abstract reasoning. The results showed a stark contrast between open-source and closed-source AI models 2.

Open-source models performed poorly, with Ahrabian stating, "They were really bad. They couldn't get anything out of it." In contrast, closed-source models like GPT-4V demonstrated better performance, though still far from matching human cognitive abilities 3.

Identifying AI's Stumbling Blocks

The researchers then examined where the AI models were failing. They found that the problem was not limited to visual processing but extended to the reasoning process itself: even when given detailed textual descriptions of the images, many models still struggled to reason effectively 4.

Improving AI Performance

To enhance AI performance, the team explored a technique called "Chain of Thought prompting." This method guides the AI through step-by-step reasoning tasks and led to significant improvements in some cases. Ahrabian noted, "By guiding the models with hints, we were able to see up to 100% improvement in performance" 2.
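As a rough illustration of the idea, the sketch below contrasts a direct prompt with a chain-of-thought prompt for a puzzle given as a textual description. The function names, the prompt wording, and the example puzzle are hypothetical and are not taken from the study. The last helper shows the arithmetic behind a relative gain, under which a "100% improvement" means the score doubled; the numbers in the usage example are made up for illustration.

```python
def format_options(options: list[str]) -> str:
    """Number the answer options, one per line."""
    return "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(options))


def build_direct_prompt(description: str, options: list[str]) -> str:
    """Ask for the answer only, with no intermediate reasoning."""
    return (
        f"Puzzle: {description}\n"
        f"Options:\n{format_options(options)}\n"
        "Reply with the number of the correct option."
    )


def build_cot_prompt(description: str, options: list[str]) -> str:
    """Ask the model to reason in explicit steps before answering.

    The hint wording here is an assumption, not the study's prompt.
    """
    return (
        f"Puzzle: {description}\n"
        f"Options:\n{format_options(options)}\n"
        "Let's think step by step.\n"
        "1. Describe how the shapes change across each row.\n"
        "2. State the rule that explains the change.\n"
        "3. Apply the rule to the missing cell.\n"
        "Then reply with the number of the correct option."
    )


def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage gain of `improved` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical puzzle in the style of Raven's Progressive Matrices.
    puzzle = (
        "Row 1: one circle, two circles, three circles. "
        "Row 2: one square, two squares, three squares. "
        "Row 3: one triangle, two triangles, missing cell."
    )
    options = ["three triangles", "two circles", "one square"]
    print(build_cot_prompt(puzzle, options))
    # Illustrative numbers only: doubling accuracy is a 100% relative gain.
    print(relative_improvement(20, 40))
```

Either prompt string would then be sent to the model under test; how the reply is parsed and scored depends on the evaluation setup, which the article does not describe.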

Implications and Future Directions

Jay Pujara, a research associate professor and an author of the study, emphasized the importance of understanding AI's limitations: "We still have such a limited understanding of what new AI models can do, and until we understand these limitations, we can't make AI better, safer, and more useful" 1.

The study's findings highlight both the current limitations of AI and the potential for future advancements. As AI models continue to evolve, this research could pave the way for developing AI systems that can not only understand but also reason in ways more comparable to human cognition 4.
