Ai2 releases open-source web agent MolmoWeb to challenge OpenAI, Google, and Anthropic

The Allen Institute for AI unveiled MolmoWeb, an open-source web agent built on the Molmo 2 multimodal model that navigates browsers using screenshots rather than code. Available in 4B and 8B parameter versions, the 8B model outperformed GPT-4o on key web navigation tasks, offering developers and researchers a community-driven alternative to closed systems from major tech companies.

Ai2 Launches MolmoWeb as Open-Source Web Agent Alternative

The Allen Institute for AI (Ai2) has released MolmoWeb, an open-source web agent designed to rival closed systems from OpenAI, Google, and Anthropic [1]. This visual AI agent can take control of web browsers and automate tasks by interpreting screenshots of webpages the way a person would, rather than relying on the underlying page code [2]. Built on Ai2's Molmo 2 multimodal model family, MolmoWeb arrives as major tech companies race to build AI agents capable of navigating computers and the web on behalf of users [1].

Source: GeekWire

The Seattle-based nonprofit released MolmoWeb in two sizes, 4B and 8B parameters, and made it freely available along with its weights, training data, code, and evaluation tools [2]. Developers can access the agent through Hugging Face and GitHub, and a demo is available for testing on supported websites [1]. This community-driven alternative lets researchers and developers look under the hood to understand what is happening, in ways not possible with closed systems [1].

How MolmoWeb Works to Automate Tasks

MolmoWeb operates by observing web pages through a series of screenshots and then interacting directly via the interface by predicting what will happen when it takes actions such as clicking, typing characters into text fields, or scrolling up and down [2]. The web agent supports navigating to URLs, clicking on screen coordinates, typing text into fields, scrolling through pages, opening and switching browser tabs, and sending messages back to users [2].
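
The action set described above can be sketched as a small observe-act loop. This is a minimal illustration under stated assumptions, not Ai2's implementation: the class names (`Goto`, `Click`, and so on), the `FakeBrowser` stand-in, and the `run_episode` and `scripted_policy` helpers are all hypothetical. A real setup would replace the scripted policy with the model and the fake browser with an actual browser driver.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Union


# Hypothetical action types mirroring the capabilities listed in the article.
@dataclass(frozen=True)
class Goto:
    url: str


@dataclass(frozen=True)
class Click:
    x: int
    y: int


@dataclass(frozen=True)
class TypeText:
    text: str


@dataclass(frozen=True)
class Scroll:
    dy: int  # assumed convention: positive scrolls down, negative scrolls up


@dataclass(frozen=True)
class NewTab:
    url: str


@dataclass(frozen=True)
class SwitchTab:
    index: int


@dataclass(frozen=True)
class SendMessage:
    text: str  # assumption: a message to the user ends the episode


Action = Union[Goto, Click, TypeText, Scroll, NewTab, SwitchTab, SendMessage]
Policy = Callable[[str, bytes], Action]


class FakeBrowser:
    """Stand-in for a real browser driver; it records actions instead of running them."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real driver would return image bytes of the current viewport.
        return b"fake-screenshot-%d" % len(self.log)

    def apply(self, action: Action) -> None:
        self.log.append(repr(action))


def run_episode(policy: Policy, browser: FakeBrowser, task: str, max_steps: int = 10) -> str:
    """Take a screenshot, ask the policy for one action, apply it, and repeat."""
    for _ in range(max_steps):
        action = policy(task, browser.screenshot())
        if isinstance(action, SendMessage):
            return action.text
        browser.apply(action)
    return "step limit reached"


def scripted_policy(actions: Iterable[Action]) -> Policy:
    """Placeholder for the model: replays a fixed list of actions in order."""
    queue = iter(actions)

    def policy(task: str, screenshot: bytes) -> Action:
        return next(queue)

    return policy


if __name__ == "__main__":
    browser = FakeBrowser()
    steps = [Goto("https://example.com"), Click(120, 340), TypeText("hello"), SendMessage("done")]
    print(run_episode(scripted_policy(steps), browser, "demo task"))
    print(browser.log)
```

Note that the policy only ever receives the task text and screenshot bytes, never the page's HTML, which is the property the article attributes to MolmoWeb.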

This visual approach offers distinct advantages. Ai2 designed the agent this way so that it won't break when the underlying webpage code or HTML changes on the fly; some web pages obfuscate how they operate to protect themselves, or use specialized JavaScript engines to detect bots and stop ad blockers [2]. Feeding the underlying page code to a model can also consume tens of thousands of tokens, the essential currency of AI operations [2]. A visual interface also more closely mirrors how humans interact with web pages, which makes it easier to debug why the model did what it did [2].
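
The token-cost point can be illustrated with rough arithmetic. The sketch below is only an estimate under labeled assumptions: the 4-characters-per-token ratio is a common rule of thumb rather than a real tokenizer, the patch-based image cost models a hypothetical vision encoder (the patch size of 14 and the 1280x720 viewport are illustrative, not MolmoWeb's actual configuration), and the product-listing page is made up.

```python
import math


def estimate_text_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate only: a chars-per-token rule of thumb, not a real tokenizer."""
    return math.ceil(len(text) / chars_per_token)


def estimate_patch_tokens(width: int, height: int, patch: int = 14) -> int:
    """Tokens for a hypothetical encoder that emits one token per patch-by-patch tile."""
    return math.ceil(width / patch) * math.ceil(height / patch)


def synthetic_page(rows: int) -> str:
    """Build a made-up product listing so that the markup size can be varied."""
    items = "".join(
        f'<div class="item"><a href="/p/{i}">Product {i}</a></div>' for i in range(rows)
    )
    return f"<html><body>{items}</body></html>"


if __name__ == "__main__":
    for rows in (100, 2000):
        html_tokens = estimate_text_tokens(synthetic_page(rows))
        # The screenshot estimate depends on viewport size, not on how much markup exists.
        shot_tokens = estimate_patch_tokens(1280, 720)
        print(f"rows={rows} html_tokens~{html_tokens} screenshot_tokens~{shot_tokens}")
```

Under these assumptions the markup estimate grows with the page while the screenshot estimate stays fixed for a given viewport, which is the shape of the trade-off the article describes.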

MolmoWeb Outperforms GPT-4o on Benchmarks

Despite its compact size, MolmoWeb achieves state-of-the-art results among open-weight web agents. On popular evaluation suites, the 8B model's scores included 78.2% on WebVoyager, 42.3% on DeepShop, and 49.5% on TailBench, and it outperformed leading open-weight models such as Fara-7B across all four benchmarks that Ai2 reported [2]. Ai2 also reports that the 8B version outperformed agents built on much larger proprietary models, including GPT-4o, on key web navigation tasks [1].

Ai2 highlighted that MolmoWeb can outperform agents built on GPT-4o that rely on annotated and structured page data. The result is notable because those agents can "see" into the code of the webpage and use models with substantially more parameters [2]. Unlike other open-weight web agents, MolmoWeb was trained without distilling from a proprietary vision-based agent; its training data came from synthetic trajectories generated by text-only accessibility agents and from recordings of humans browsing the web [2].

Building on Olmo's Open Foundation for LLMs

"In many ways, web agents today are where LLMs were before Olmo -- the community needs an open foundation to build on," Ai2 stated in a blog post, referring to its open large language model project that has served as a counterpoint to closed models from OpenAI and others [1]. This release provides more access to open-weight browser AI agents, helping researchers and hobbyists develop their own web automations [2].

The release comes during a transition period for Ai2, with CEO Ali Farhadi and key researchers departing for Microsoft, where they are joining Mustafa Suleyman's Superintelligence team [1]. Ai2's primary funder is shifting its focus away from model training toward real-world applications of AI, though all of Ai2's programs for 2026 are fully funded [1]. Anthropic recently acquired Seattle-based startup Vercept, founded by Ai2 veterans, which was building similar screen-understanding agentic technology for Macs and PCs [1]. Closed-source providers including Perplexity AI have already entered the market with agentic web browsers capable of automating web tasks [2].

TheOutpost.ai

© 2026 Triveous Technologies Private Limited