Probably AI Raises $9M Seed Funding to Fix Hallucinations

Probably AI Secures Seed Funding to Tackle AI Hallucinations

Probably AI has raised $9 million in seed funding co-led by Andreessen Horowitz and Accel, with participation from Tokyo Black and Vermilion Cliffs Ventures, to address one of the most persistent challenges in artificial intelligence: reducing factual errors in LLMs [1]. While most of the industry attempts to fix AI hallucinations by building larger, more powerful large language models, founder Peter Elias is betting on the opposite strategy. The company aims to achieve 99.99% accuracy, the kind of reliability common in deterministic systems but rarely seen in AI, by catching errors before they ever reach users [1].

Source: TechCrunch

A Verifiable Data Agent Built on Harness Engineering

Probably AI's first product is a verifiable data agent: a data science tool that produces quick answers from complex datasets. Each result includes a citation and an audit trail showing how it was produced [1]. What sets this approach apart is what Elias describes as a "data science mech suit," an elaborate harness system that validates every answer. The LLM takes a first pass at answering a query, and its results then run through a separate deterministic validator that checks them against the actual dataset and rejects anything that doesn't match [2]. The LLM has been trained against this validator, and the entire system is optimized for fast, high-accuracy responses.
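The generate-then-validate loop described above can be sketched in a few lines. This is only an illustration of the general pattern, not Probably's implementation: the `Candidate` type, the `validate` and `answer` functions, and the sample `sales` table are all hypothetical, and Python's built-in `sqlite3` stands in for the dataset. The "model" is simulated by a hard-coded list of proposed answers.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Candidate:
    """A proposed answer: the query the model says it used and the value it reports."""
    sql: str
    claimed_value: float


def validate(conn: sqlite3.Connection, candidate: Candidate, tol: float = 1e-9) -> bool:
    """Deterministically recompute the answer from the data and compare it to the claim."""
    row = conn.execute(candidate.sql).fetchone()
    if row is None or row[0] is None:
        return False
    return abs(float(row[0]) - candidate.claimed_value) <= tol


def answer(conn: sqlite3.Connection, candidates: list):
    """Return the first candidate that passes validation, with its query as the audit trail."""
    for cand in candidates:
        if validate(conn, cand):
            return {"value": cand.claimed_value, "audit_sql": cand.sql}
    return None  # nothing could be verified, so refuse instead of guessing


def demo():
    # Hypothetical dataset used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100.0), ("west", 250.0), ("east", 50.0)],
    )
    query = "SELECT SUM(amount) FROM sales WHERE region = 'east'"
    candidates = [
        Candidate(query, 175.0),  # simulated hallucination: does not match the data
        Candidate(query, 150.0),  # matches what the data actually says
    ]
    return answer(conn, candidates)
```

In this sketch the wrong figure never reaches the caller, because the validator recomputes the value from the data rather than trusting the model's claim, and the surviving answer carries the query that produced it.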

Smaller, Cheaper Models That Run on Local Hardware

The implications for cost are striking. "What we learned building this was that the better your harness engineering is, the weaker the model can be," Elias explains. "If you can refine the context enough, the model does not have to work very hard to do the right thing. Basically, it's an exercise in reducing ambiguity" [1]. This approach allows Probably's data science tool to run on models that are "four classes weaker than the frontier models," meaning it can operate on local hardware like a desktop computer instead of requiring a data center. That sharply reduces the token costs associated with AI use, a welcome development as companies reassess their AI budgets amid rising expenses [2].

Privacy and Precision-Sensitive Fields

The tool runs locally on the open-source database DuckDB, and the company states that the model only sees metadata and statistics, never raw data, which remains on the user's machine [2]. This creates a compelling privacy pitch alongside the cost benefits. Elias envisions extending the same engine to precision-sensitive fields like accounting and medical services, or "any precision-sensitive use case" where confident wrong answers pose serious risks [1]. Researchers have repeatedly warned about AI hallucinations in science and critical applications, making this approach particularly relevant [2].
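To make the "metadata and statistics, never raw data" idea concrete, here is a minimal sketch of building a model context from column summaries alone. It is an assumption-laden illustration, not the company's code: `column_profile`, `build_context`, and the `patients` table are hypothetical names, and Python's built-in `sqlite3` is used as a stand-in for DuckDB so the example runs without extra dependencies. Minimum and maximum values are reported only for numeric columns, since the extremes of a text column would be raw values.

```python
import sqlite3

NUMERIC_TYPES = {"INTEGER", "REAL"}


def column_profile(conn: sqlite3.Connection, table: str) -> list:
    """Summarize each column with aggregate statistics only; no rows are returned."""
    profile = []
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    for _cid, name, col_type, *_rest in columns:
        non_null, distinct = conn.execute(
            f'SELECT COUNT("{name}"), COUNT(DISTINCT "{name}") FROM {table}'
        ).fetchone()
        entry = {"column": name, "type": col_type, "non_null": non_null, "distinct": distinct}
        if col_type.upper() in NUMERIC_TYPES:
            lo, hi, mean = conn.execute(
                f'SELECT MIN("{name}"), MAX("{name}"), AVG("{name}") FROM {table}'
            ).fetchone()
            entry.update(min=lo, max=hi, mean=mean)
        profile.append(entry)
    return profile


def build_context(profile: list) -> str:
    """Render the profile as plain text: the only thing a model would be shown."""
    return "\n".join(
        ", ".join(f"{key}={value}" for key, value in col.items()) for col in profile
    )


def demo() -> str:
    # Hypothetical local table; in this sketch the rows never leave the process.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("Alice", 34), ("Bob", 51), ("Alice", 29)],
    )
    return build_context(column_profile(conn, "patients"))
```

Even aggregate statistics can reveal information about small datasets, so a sketch like this shows the shape of the idea rather than a privacy guarantee.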

Why Major Labs Haven't Pursued This Approach

Elias offers a provocative explanation for why big AI labs haven't attempted this strategy: "They're incentivized not to, because they make money the more times you have to correct the model" [1]. While major labs do invest resources in cutting AI hallucinations, the observation highlights a potential misalignment between business models and user needs. The approach does have limitations: a validator only works when there's hard ground truth to check against, like a dataset, which is why Probably started with data rather than open-ended writing [2]. The product is currently in public preview at version 0.1, and the 99.99% accuracy figure remains a goal rather than a proven result. Still, in a market crowded with attempts to tame hallucinations, betting on smaller models wrapped in rigorous validation represents a distinct strategy that attracted backing from Andreessen Horowitz and Accel.

References

[1] Probably raises $9M to build a more reliable kind of AI

[2] Probably raises $9M to fix AI hallucinations on cheap models
