Researchers propose new framework to protect users when AI agents fumble with their money

A consortium of researchers from Microsoft Research, Google DeepMind, Columbia University, and AI startups has unveiled the Agentic Risk Standard, a financial protection framework designed to compensate users when AI agents misexecute trades or cause financial losses. The open-source protocol introduces escrow, collateral, and underwriting mechanisms to address what researchers call a 'guarantee gap' between probabilistic AI reliability and the enforceable guarantees users need for high-stakes tasks.

A New Financial Safety Net for AI Agents

As AI agents increasingly handle payments, trades, and financial transactions, a fundamental question has emerged: who pays when these systems make mistakes? Researchers from Microsoft Research, Google DeepMind, Columbia University, Virtuals Protocol, and t54 Labs have proposed the Agentic Risk Standard (ARS), a financial protection framework designed to compensate users when AI agents fail [1]. Published on April 8 and made available as an open-source protocol on GitHub, ARS aims to do for AI agents what escrow, insurance, and clearinghouses do for traditional financial markets [1].

Source: Decrypt

The framework addresses what researchers identify as a critical vulnerability in the emerging agentic economy. Unlike conventional AI safety research that focuses on reducing the probability of failure, ARS takes a complementary approach by formalizing what happens financially when systems inevitably fail [1]. "Most trustworthy AI research aims to reduce the probability of failure," said Wenyue Hua, Senior Researcher at Microsoft Research. "That work is essential, but probability is not a guarantee. ARS takes a complementary approach: instead of trying to make the model perfect, we formalize what happens financially when it isn't" [1].

The Guarantee Gap and Why Current Safety Measures Fall Short

The core problem ARS tackles is what researchers call a "guarantee gap": a disconnect between the probabilistic reliability that AI safety techniques provide and the enforceable guarantees users need before delegating high-stakes tasks [1]. Large language models are inherently stochastic, meaning that no matter how well trained or tuned an AI agent is, it can still hallucinate and make mistakes [1]. When that agent is executing financial API calls or managing brokerage accounts, even a single failure can produce immediate, realized loss [1].

"Technical safeguards can offer only probabilistic reliability, whereas users in high-stakes settings often require enforceable guarantees over outcomes," the paper stated

2

. Without a way to bound potential losses, users rationally limit AI delegation to low-risk tasks, constraining broader adoption of agent-based services

1

. The financial ecosystem currently has no way to operate other than to defer all liability back to a human, according to t54 founder Chandler Fang

1

.

How the Settlement Framework Actually Works

The Agentic Risk Standard borrows directly from centuries of financial engineering, managing risk through a layered settlement framework [1]. The system distinguishes between two types of AI jobs, each requiring a different level of user protection [1].

For standard service tasks that carry limited financial exposure, like generating slide decks or writing reports, escrow-based settlement is sufficient [1]. Service fees are held in escrow vaults and released only upon verified task delivery [1]. For high-risk financial operations involving currency trading, leveraged positions, or other tasks requiring access to user funds before outcomes can be verified, the framework introduces underwriting [1].
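To make the escrow flow concrete, here is a minimal Python sketch of that logic: the fee is locked when the job starts and released only on verified delivery, otherwise refunded. The names (`EscrowVault`, `settle`) are hypothetical, and this is not the ARS reference implementation published on GitHub.

```python
from dataclasses import dataclass


@dataclass
class EscrowVault:
    """Holds a service fee until delivery is verified (illustrative only;
    not the ARS reference implementation)."""
    task_id: str
    fee: float               # service fee locked when the job starts
    settled: bool = False

    def settle(self, delivery_verified: bool) -> str:
        """Release the fee to the provider on verified delivery,
        otherwise refund the user."""
        if self.settled:
            raise RuntimeError("vault already settled")
        self.settled = True
        if delivery_verified:
            return f"fee {self.fee:.2f} released to provider for {self.task_id}"
        return f"fee {self.fee:.2f} refunded to user for {self.task_id}"


# Example: a slide-deck job whose output passed verification.
vault = EscrowVault(task_id="slides-42", fee=20.0)
print(vault.settle(delivery_verified=True))
```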

Underwriting brings in a risk-bearing third party that evaluates the danger of an AI failure, requires service providers to post collateral, charges a premium, and commits to reimbursing users if things go wrong [1][2]. It's the same logic that governs derivatives markets, where clearinghouses stand between counterparties so that a single default doesn't cascade [1].
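The underwriting economics can be sketched in a few lines. The expected-loss pricing with a safety load below is a standard actuarial device assumed for illustration, not a formula taken from the paper:

```python
def price_premium(p_fail: float, exposure: float, load_factor: float = 1.5) -> float:
    """Expected-loss pricing with a safety load: p_fail * exposure * load.
    (Illustrative; the actual ARS pricing model is not specified here.)"""
    return p_fail * exposure * load_factor


def required_collateral(exposure: float, coverage_ratio: float = 1.0) -> float:
    """Collateral the service provider must post to back potential claims."""
    return exposure * coverage_ratio


def settle_claim(loss: float, collateral: float) -> tuple[float, float]:
    """Reimburse the user from posted collateral, capped at what was posted.
    Returns (payout_to_user, remaining_collateral)."""
    payout = min(loss, collateral)
    return payout, collateral - payout


# Example: a $10,000 trading task with an estimated 2% failure rate.
exposure = 10_000.0
premium = price_premium(p_fail=0.02, exposure=exposure)   # $300 with a 1.5x load
collateral = required_collateral(exposure)                # $10,000 posted up front
payout, remaining = settle_claim(loss=4_000.0, collateral=collateral)
print(premium, payout, remaining)                         # 300.0 4000.0 6000.0
```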

Regulatory Attention and Real-World Implementation Challenges

Financial regulators are already circling. FINRA's 2026 regulatory oversight report, released in December, included a first-ever section on generative AI, warning broker-dealers to develop procedures specifically targeting AI hallucinations and to scrutinize AI agents that may act "beyond the user's actual or intended scope and authority" [1]. The SEC and other agencies are watching closely [1].

ARS is pitched not as a set of rules, but as a protocol: a standardized state machine that governs how funds are locked, how claims are filed, and how reimbursements are triggered when AI agents fail [1].
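A minimal sketch of what such a settlement state machine could look like follows; the state names and transition graph are invented for illustration and may not match the open-source protocol:

```python
from enum import Enum, auto


class JobState(Enum):
    FUNDS_LOCKED = auto()   # fee/collateral locked at job start
    DELIVERED = auto()      # task output verified
    FAILED = auto()         # failure detected
    CLAIM_FILED = auto()    # user files a reimbursement claim
    REIMBURSED = auto()     # user compensated from collateral
    CLOSED = auto()         # terminal state, remaining funds released


# Allowed transitions (illustrative, not the protocol's actual graph).
TRANSITIONS = {
    JobState.FUNDS_LOCKED: {JobState.DELIVERED, JobState.FAILED},
    JobState.DELIVERED: {JobState.CLOSED},
    JobState.FAILED: {JobState.CLAIM_FILED},
    JobState.CLAIM_FILED: {JobState.REIMBURSED},
    JobState.REIMBURSED: {JobState.CLOSED},
    JobState.CLOSED: set(),
}


def advance(state: JobState, target: JobState) -> JobState:
    """Move to `target` only if the transition graph allows it."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target


# Example: a failed job flowing through claim and reimbursement.
s = JobState.FUNDS_LOCKED
for nxt in (JobState.FAILED, JobState.CLAIM_FILED,
            JobState.REIMBURSED, JobState.CLOSED):
    s = advance(s, nxt)
print(s.name)  # CLOSED
```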

The researchers tested the system using a simulation that ran 5,000 trials, though they acknowledged the experiment was limited and not designed to reflect real-world failure rates [2].
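That experiment is not reproduced here, but a toy Monte Carlo with made-up parameters shows what such a simulation measures: without protection the user absorbs every failure directly, while under the underwriting sketch above the user's cost is bounded by the premiums and losses are reimbursed from collateral:

```python
import random

random.seed(0)

TRIALS = 5_000
P_FAIL = 0.02                        # assumed per-task failure probability (made up)
EXPOSURE = 10_000.0                  # user funds at risk per task
PREMIUM = P_FAIL * EXPOSURE * 1.5    # per-task premium, using the pricing sketch above

uninsured_loss = 0.0    # user absorbs every failure directly
insured_cost = 0.0      # user pays premiums; failures are reimbursed from collateral

for _ in range(TRIALS):
    failed = random.random() < P_FAIL
    uninsured_loss += EXPOSURE if failed else 0.0
    insured_cost += PREMIUM

# The insured cost exceeds the *expected* uninsured loss (that gap is the
# underwriter's load), but the user's worst case is now bounded by the premiums
# rather than by the full exposure of every failed task.
print(f"uninsured user loss: {uninsured_loss:,.0f}")
print(f"insured user cost:   {insured_cost:,.0f}")
```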

The real bottleneck, researchers acknowledge, will be building accurate risk-pricing models for agentic behavior [1]. Accurate failure-rate estimates remain the main challenge, as both over- and underestimation create systemic risks [2].
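One standard actuarial way to hedge that estimation problem, assumed here for illustration rather than taken from the paper, is to price against a conservative upper credible bound on the failure rate instead of the raw point estimate:

```python
from statistics import NormalDist


def conservative_failure_rate(failures: int, trials: int,
                              alpha: float = 1.0, beta: float = 1.0,
                              quantile: float = 0.95) -> float:
    """Upper credible bound on the failure rate from a Beta(alpha, beta) prior,
    via a normal approximation to the Beta posterior (illustrative only)."""
    a = alpha + failures
    b = beta + (trials - failures)
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return min(1.0, mean + NormalDist().inv_cdf(quantile) * var ** 0.5)


# Example: 12 observed failures across 1,000 monitored tasks.
p_point = 12 / 1_000                               # naive point estimate: 1.2%
p_safe = conservative_failure_rate(12, 1_000)      # ~1.9% at the 95% bound
print(p_point, round(p_safe, 4))
```

Pricing off the upper bound charges slightly more when data is scarce, which protects the underwriter from underestimation; as observed trials accumulate, the bound tightens toward the point estimate.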

The paper noted that non-financial harms such as hallucination, defamation, or psychological harm remain outside the framework [2]. Future work will focus on risk modeling for diverse failure modes, empirical measurement of failure frequencies under deployment-like conditions, and designing underwriting and collateral schedules that remain robust under detector error and strategic behavior [2].
