AI's Rapid Progress: Closing the Gap on Complex Human Tasks


A new study reveals AI models are rapidly improving their ability to handle complex tasks, potentially matching human performance on month-long projects by 2029. This progress raises both excitement and concerns about AI's future impact on various industries.


AI's Rapid Progress in Tackling Complex Tasks

A study by the Model Evaluation & Threat Research (METR) group reports that artificial intelligence (AI) is making significant strides in handling complex, time-consuming tasks traditionally performed by human experts. The research introduces a new metric called the "task-completion time horizon": the length of a task, measured by how long it takes human experts, that an AI model can complete with a 50% success rate 1.

Exponential Growth in AI Capabilities

The study found that the time horizon of leading AI models has been doubling approximately every seven months since 2019. Growth accelerated in 2024, with the latest models doubling their horizon roughly every three months. If the trend continues, AI models could handle tasks that take humans about a month to complete, with 50% reliability, by 2029 1.
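
The 2029 projection can be checked with a few lines of arithmetic. The sketch below is illustrative and not the study's own method. It starts from the 59-minute horizon reported later in this article for Claude 3.7 Sonnet (February 2025), treats "a month" of human work as roughly 167 working hours (an assumption, not a figure from the article), and applies the two doubling times quoted above. All function and constant names are hypothetical.

```python
import math

# Assumption: one month of human work is about 167 working hours.
MINUTES_PER_WORK_MONTH = 167 * 60

# From the article: a 59-minute horizon for a model released in February 2025.
START_HORIZON_MIN = 59
START_YEAR = 2025 + 1 / 12  # February 2025 expressed as a decimal year


def months_until_horizon(current_minutes, target_minutes, doubling_months):
    """Months for the horizon to grow from current to target at a fixed doubling time."""
    doublings = math.log2(target_minutes / current_minutes)
    return doublings * doubling_months


def projected_year(start_year, months_ahead):
    """Convert a number of months ahead into a decimal calendar year."""
    return start_year + months_ahead / 12


for doubling in (7, 3):
    months = months_until_horizon(START_HORIZON_MIN, MINUTES_PER_WORK_MONTH, doubling)
    year = projected_year(START_YEAR, months)
    print(f"doubling every {doubling} months: ~{months:.0f} months, around {year:.1f}")
```

Under these assumptions, the seven-month doubling time reaches a month-long horizon during 2029, while a sustained three-month doubling time would get there around late 2026. This suggests the 2029 figure corresponds to the longer-run trend.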

Benchmarking AI Against Human Performance

METR created nearly 170 real-world tasks across various domains, including coding, cybersecurity, general reasoning, and machine learning. They established a human baseline by measuring the time taken by expert programmers to complete these tasks. The research team then assessed the progress of AI models against this baseline 1.
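
The article does not spell out how a 50% time horizon is derived from such a benchmark. One plausible approach, assumed here purely for illustration, is to fit a logistic curve of model success against the logarithm of the human completion time and read off the task length at which predicted success crosses 50%. The data below is invented, and `fit_time_horizon` is a hypothetical helper, not part of any METR tooling.

```python
import math


def fit_time_horizon(results, steps=5000, lr=0.1):
    """Estimate the task length (minutes) at which predicted success is 50%.

    results: list of (human_minutes, succeeded) pairs.
    Fits P(success) = sigmoid(a + b * x) by gradient ascent on the
    log-likelihood, where x is the centered log2 of human minutes.
    """
    xs = [math.log2(minutes) for minutes, _ in results]
    ys = [1.0 if ok else 0.0 for _, ok in results]
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]  # centering keeps the fit well conditioned
    n = len(xs)

    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(centered, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n

    # Success is 50% where a + b * x = 0, i.e. x = -a / b (in centered log2 units).
    return 2 ** (mean_x - a / b)


# Hypothetical results: (human completion time in minutes, did the model succeed?)
toy_results = [
    (1, True), (2, True), (4, True), (8, False),
    (16, True), (32, False), (64, False), (128, False),
]

horizon = fit_time_horizon(toy_results)
print(f"estimated 50% time horizon: {horizon:.1f} minutes")
```

Working in log time reflects the fact that task lengths span several orders of magnitude, from seconds to hours, so a doubling of task length is treated as a constant step.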

Key Findings and Comparisons

  • GPT-2, an early large language model from 2019, failed on all tasks that took human experts more than one minute.
  • Claude 3.7 Sonnet, released in February 2025, completed 50% of the tasks that would take people 59 minutes 1.
  • A second source describes today's frontier models, like Claude 3.7 Sonnet, as already handling tasks roughly 50 minutes long, a slightly lower figure than the 59 minutes cited above 2.

Driving Factors Behind AI Progress

The paper attributes the progress in AI's time horizon metric to improvements in several key areas:

  1. Logical reasoning
  2. Tool use
  3. Error correction
  4. Self-awareness in task execution 1

Modern AI models are learning to persist and to correct their own errors, both critical traits for automation at scale 2.

Implications and Concerns

While the study confirms rapid AI progress, it also raises concerns about potential misuse. As AI systems become capable of extended autonomous operation, new safety measures will be needed to prevent risks such as self-replicating AI or autonomous development of hazardous materials 2.

The implications of this progress stretch beyond software development. Fields like legal research, cybersecurity, and scientific discovery could see AI playing a much larger role in the near future 2.

Limitations and Challenges

Despite the impressive progress, AI still faces challenges in certain areas:

  • Performance drops on "messier" real-world tasks requiring creativity, strategic thinking, or human collaboration.
  • AI excels at structured problems with clear objectives but struggles in unpredictable environments 2.

Some experts, like Joshua Gans from the University of Toronto, caution against over-reliance on these predictions, noting that there is still much uncertainty about how AI will actually be used in practice 1.

TheOutpost.ai


© 2025 Triveous Technologies Private Limited