Anthropic Mythos Discovers 271 Firefox Vulnerabilities, Including Bugs Hidden for 15 Years

4 Sources

Share

Mozilla deployed Anthropic Mythos to uncover 271 security flaws in Firefox, resulting in 423 total bug fixes in April—more than 13 times the monthly average. The AI model found decade-old vulnerabilities that traditional fuzzing missed, though skeptics question whether the breakthrough lies in the model itself or Mozilla's custom harness system.

News article

Mozilla Achieves Breakthrough in AI-Assisted Vulnerability Detection

Mozilla fixed 423 Firefox security bugs in April 2026, a repair rate more than 13 times higher than its monthly average and nearly five times the 76 fixes issued in March

3

. The browser maker attributes this surge to Anthropic Mythos, an AI model that discovered 271 vulnerabilities in Firefox 150 over two months

1

. Among these findings were high-severity software vulnerabilities that had remained hidden for over a decade, including a 15-year-old bug in Firefox's <legend> HTML element that survived years of fuzzing and manual audits

4

.

Mozilla Distinguished Engineer Brian Grinstead emphasized that the system produces "almost no false positives," a stark contrast to earlier AI-powered scanning attempts that flooded security teams with hallucinated reports

1

. The breakthrough addresses a persistent challenge in AI-assisted vulnerability detection: generating actionable findings without overwhelming human reviewers.

Custom Agentic Harness Drives Detection Accuracy

The key to Mozilla's success lies in a custom agentic harness—middleware that guides the AI model through specific tasks rather than simply prompting it to analyze code blocks. Grinstead described the harness as "the code that drives the LLM in order to accomplish a goal," providing Anthropic Mythos with access to the same tools Mozilla developers use, including specialized Firefox builds for testing

1

.

The harness enables deterministic verification by running test cases against Firefox's sanitizer build. When the AI model identifies a potential memory safety issue and crafts an exploit, the system automatically validates whether it triggers a crash. A second AI model then grades the output, ensuring high-quality reports before human engineers review them

1

. This multi-layered approach filters out the "unwanted slop" that plagued earlier attempts at using AI for software security.

Mozilla deployed the harness across numerous ephemeral virtual machines, targeting specific files in the codebase. This distributed approach allowed comprehensive scanning while maintaining the verification rigor needed to avoid false positives

4

.

Sandbox Vulnerabilities and Complex Attack Chains Uncovered

Among the 271 Firefox security vulnerabilities discovered, many involved sandbox escape exploits—particularly difficult bugs to detect through conventional methods. Mozilla's bug bounty program pays up to $20,000 for sandbox vulnerabilities, the highest reward available, yet Grinstead noted that Mythos finds more sandbox issues than human researchers ever did

2

.

The AI model demonstrated combinatorial reasoning capabilities that traditional fuzzing cannot match. One example involved a 20-year-old heap use-after-free bug triggered through the XSLTProcessor DOM API without user interaction

3

. Another case required understanding how three independent Firefox behaviors collide to create a vulnerability—each behavior appearing innocent individually but becoming exploitable when combined

4

.

Out of the 271 bugs, 180 received sec-high severity ratings and 80 were classified as sec-moderate

4

. Mozilla unhid full Bugzilla reports for 12 vulnerabilities, including detailed test cases that meet the same criteria required for all Firefox security vulnerabilities

1

.

Skepticism Surrounds Attribution and Methodology

Despite Mozilla's transparency efforts, cybersecurity experts have raised questions about whether Anthropic Mythos itself deserves credit or if the custom harness makes the difference. Davi Ottenheimer, president of security consultancy flyingpenguin, conducted tests showing that lesser models like Sonnet 4.6 and Haiku 4.5, when paired with a harness called Wirken, identified eight findings in two minutes at approximately $0.75—two of which matched bugs Mythos had found

3

.

Ottenheimer criticized Mozilla for not providing transparent comparisons between Mythos and other models, particularly since Mozilla acknowledged that Opus 4.6 was already identifying "an impressive amount of previously unknown vulnerabilities"

3

. The skepticism centers on whether Mozilla proved that only Mythos could achieve these results, or if comparable outcomes might be possible with less expensive models.

Mozilla has not obtained CVE designations for the 271 vulnerabilities, following its standard practice of not seeking CVE listings for internally discovered security bugs

1

. This decision initially fueled criticism but aligns with how many developers handle internal findings.

Implications for Defenders and Attackers

While AI coding tools have made documented progress, Firefox engineers still write and review every patch manually. Though they ask AI to generate patch code, the output typically cannot be deployed directly and serves only as a model for human engineers

2

. Grinstead confirmed that "every single one is one engineer writing a patch and one engineer reviewing it" and noted they "have not found it to be automatable"

2

.

The broader question remains whether these capabilities shift the balance in cybersecurity toward defenders or attackers. Anthropic CEO Dario Amodei suggested that "if we handle this right, we could be in a better position than we started, because we fixed all these bugs"

2

. However, Grinstead offered a more measured perspective: "It's useful for both attackers and defenders, but having the tool available shifts the advantage a little bit to defense. Realistically, nobody knows the answer to this yet"

2

.

Mozilla plans to integrate automated scanning into its continuous integration pipeline, analyzing code changes immediately for potential vulnerabilities

4

. As organizations across the software industry watch Mozilla's experiment, the pressure builds to act on similar findings before adversaries exploit the same AI techniques.

Today's Top Stories

TheOutpost.ai

Don’t drown in AI news. We cut through the noise - filtering, ranking and summarizing the most important AI news, breakthroughs and research daily. Spend less time searching for the latest in AI and get straight to action.

Instagram logo
LinkedIn logo
Youtube logo
© 2026 TheOutpost.AI All rights reserved