[1]
Ai2 releases Olmo 3 open models, rivaling Meta, DeepSeek and others on performance and efficiency
The Allen Institute for AI (Ai2) released a new generation of its flagship large language models, designed to compete more squarely with industry and academic heavyweights. The Seattle-based nonprofit unveiled Olmo 3, a collection of open language models that it says outperforms fully open models such as Stanford's Marin and commercial open-weight models like Meta's Llama 3.1.

Earlier versions of Olmo were framed mainly as scientific tools for understanding how AI models are built. With Olmo 3, Ai2 is expanding its focus, positioning the models as powerful, efficient, and transparent systems suitable for real-world use, including commercial applications.

"Olmo 3 proves that openness and performance can advance together," said Ali Farhadi, the Ai2 CEO, in a press release Thursday morning announcing the new models.

It's part of a broader evolution in the AI world. Over the past year, increasingly powerful open models from companies and universities -- including Meta, DeepSeek, Qwen, and Stanford -- have started to rival the performance of proprietary systems from big tech companies. Many of the latest open models are designed to show their reasoning step-by-step -- commonly called "thinking" models -- which has become a key benchmark in the field.

Ai2 is releasing Olmo 3 in multiple versions: Olmo 3 Base (the core foundation model); Olmo 3 Instruct (tuned to follow user directions); Olmo 3 Think (designed to show more explicit reasoning); and Olmo 3 RL Zero (an experimental model trained with reinforcement learning).

Open models have been gaining traction with startups and businesses that want more control over costs and data, along with clearer visibility into how the technology works. Ai2 is going further by releasing the full "model flow" behind Olmo 3 -- a set of snapshots showing how the model progressed through each stage of training.
In addition, an updated OlmoTrace tool will let researchers link a model's reasoning steps back to the specific data and training decisions that influenced them.

In terms of energy and cost efficiency, Ai2 says the new Olmo base model is 2.5 times more efficient to train than Meta's Llama 3.1 (based on GPU-hours per token, comparing Olmo 3 Base to Meta's 8B post-trained model). Much of this gain comes from training Olmo 3 on far fewer tokens than comparable systems, in some cases six times fewer than rival models. Among other improvements, Ai2 says Olmo 3 can read or analyze much longer documents at once, with support for inputs up to 65,000 tokens, about the length of a short book chapter.

Founded in 2014 by the late Microsoft co-founder Paul Allen, Ai2 has long operated as a research-focused nonprofit, developing open-source tools and models while bigger commercial labs dominated the spotlight. The institute has made a series of moves this year to elevate its profile while preserving its mission of developing AI to solve the world's biggest problems. In August, Ai2 was selected by the National Science Foundation and Nvidia for a landmark $152 million initiative to build fully open multimodal AI models for scientific research, positioning the institute to serve as a key contributor to the nation's AI backbone. It also serves as the key technical partner for the Cancer AI Alliance, helping Fred Hutch and other top U.S. cancer centers train AI models on clinical data without exposing patient records.

Olmo 3 is available now on Hugging Face and Ai2's model playground.
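The efficiency metric Ai2 cites is a simple ratio: total training compute (GPU-hours) divided by the number of tokens trained on. As a rough sketch of how that comparison works, the figures below are hypothetical round numbers chosen to reproduce the reported 2.5x gap; they are not Ai2's or Meta's published training numbers.

```python
def gpu_hours_per_token(gpu_hours: float, tokens: float) -> float:
    """Training cost normalized by dataset size (lower is cheaper)."""
    return gpu_hours / tokens

# Hypothetical, illustrative figures only.
llama_cost = gpu_hours_per_token(gpu_hours=1.5e6, tokens=15e12)  # 1e-7 GPU-hours/token
olmo_cost = gpu_hours_per_token(gpu_hours=0.24e6, tokens=6e12)   # 4e-8 GPU-hours/token

# "2.5x more efficient to train" means the per-token cost ratio is about 2.5.
ratio = llama_cost / olmo_cost
print(f"Cost ratio per token: {ratio:.1f}x")  # prints "Cost ratio per token: 2.5x"
```

Note that because the metric is normalized per token, the reported gain reflects cheaper training per unit of data; Olmo 3's smaller token budget (6 trillion tokens) lowers total cost further on top of that.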
[2]
Ai2's Olmo 3 family challenges Qwen and Llama with efficient, open reasoning and customization
The Allen Institute for AI (Ai2) hopes to take advantage of increased demand for customized models, and of enterprises seeking more transparency from AI models, with its latest release. Ai2 made the latest addition to its Olmo family of large language models available to organizations, continuing its focus on openness and customization. Olmo 3 has a longer context window, more reasoning traces and better coding performance than its previous iteration. This latest version, like the other Olmo releases, is open-sourced under the Apache 2.0 license. Enterprises will have complete transparency into, and control over, the training data and checkpointing.

Ai2 will release three versions of Olmo 3:

* Olmo 3-Think, in both 7B and 32B, the flagship reasoning models for advanced research
* Olmo 3-Base, also in both parameter sizes, which is ideal for programming, comprehension, math and long-context reasoning. Ai2 said this version is "ideal for continued pre-training or fine-tuning"
* Olmo 3-Instruct, in 7B, optimized for instruction following, multi-turn dialogue and tool use

The company said Olmo 3-Think is the "first-ever fully open 32B thinking model that generates explicit reasoning-chain-style content." Olmo 3-Think also has a long context window of 65,000 tokens, well suited to longer-running agentic projects or reasoning over longer documents.

Noah Smith, Ai2's senior director of NLP research, told VentureBeat in an interview that many of its customers, from regulated enterprises to research institutions, want to use models that give them assurance about what went into the training. "The releases from our friends in the tech world are very cool and super exciting, but there are a lot of people for whom data privacy, control over what goes into the model, how the models train and other constraints on how the model can be used are front of mind," said Smith.

Developers can access the models on Hugging Face and the Ai2 Playground.
Transparency and customization

Smith said the company believes any organization using models like Olmo 3 should have control over them and be able to mold them in the way that works best for it. "We don't believe in one-size-fits-all solutions," Smith said. "It's a known thing in the world of machine learning that if you try and build a model that solves all the problems, it ends up not being really the best model for any one problem. There aren't formal proofs of that, but it's a thing that old timers like me have kind of observed."

He added that models with the ability to specialize "are maybe not as flashy as getting high scores on math exams" but offer more flexibility for enterprises.

Olmo 3 allows enterprises to essentially retrain the model by adding to the data mix it learns from. The idea is that businesses can bring in their proprietary sources to guide the model in answering specific company queries. To help enterprises during this process, Ai2 added checkpoints from every major training phase.

Demand for model customization has grown as enterprises that cannot build their own LLMs want to create company-specific or industry-focused models. Startups like Arcee have begun offering enterprise-focused, customizable small models.

Models like Olmo 3, Smith said, also give enterprises more confidence in the technology. Since Olmo 3 provides the training data, Smith said enterprises can trust that the model did not ingest anything it shouldn't have. Ai2 has long claimed a commitment to greater transparency, launching a tool called OlmoTrace in April that can track a model's output directly back to the original training data. The company releases open-sourced models and posts its code to repositories like GitHub for anyone to use. Competitors like Google and OpenAI, by contrast, have faced criticism from developers over moves that hid raw reasoning tokens in favor of summarized reasoning, with developers claiming they now resort to "debugging blind" without that transparency.
Ai2 pretrained Olmo 3 on its six-trillion-token open dataset, Dolma 3, which encompasses web data, scientific literature and code. Smith said the team optimized Olmo 3 for code, compared with the focus on math for Olmo 2.

How it stacks up

Ai2 claims that the Olmo 3 family of models represents a significant leap for truly open-source models, at least among open-source LLMs developed outside China. The base Olmo 3 model trained "with roughly 2.5x greater compute efficiency as measured by GPU-hours per token," meaning it consumed less energy during pre-training and cost less. The company said the Olmo 3 models outperformed other open models, such as Marin from Stanford, LLM360's K2 and Apertus, though Ai2 did not provide figures for the benchmark testing.

"Of note, Olmo 3-Think (32B) is the strongest fully open reasoning model, narrowing the gap to the best open-weight models of similar scale, such as the Qwen 3-32B-Thinking series of models across our suite of reasoning benchmarks, all while being trained on 6x fewer tokens," Ai2 said in a press release. The company added that Olmo 3-Instruct performed better than Qwen 2.5, Gemma 3 and Llama 3.1.
The Allen Institute for AI releases Olmo 3, a new generation of open-source language models designed to rival proprietary systems. The models feature enhanced reasoning capabilities, transparency, and energy efficiency while maintaining full openness for commercial use.
The Allen Institute for AI (Ai2) has released Olmo 3, a new generation of open-source large language models positioned to compete directly with industry leaders like Meta, DeepSeek, and other commercial AI systems. [1]
The Seattle-based nonprofit's latest offering represents a significant evolution from earlier Olmo versions, which were primarily designed as scientific research tools, to powerful systems suitable for real-world commercial applications.
Source: VentureBeat
"Olmo 3 proves that openness and performance can advance together," said Ali Farhadi, Ai2's CEO, highlighting the organization's commitment to maintaining transparency while delivering competitive performance. [1]
This release comes at a time when increasingly powerful open models from various organizations have begun rivaling proprietary systems from major tech companies. Ai2 is releasing Olmo 3 in four distinct versions, each optimized for specific applications. The Olmo 3 Base serves as the core foundation model, while Olmo 3 Instruct is fine-tuned to follow user directions and handle multi-turn dialogue. [2]
The flagship Olmo 3-Think model, available in both 7B and 32B parameter versions, represents what the company calls "the first-ever fully open 32B thinking model that generates explicit reasoning-chain-style content." [2]
The models feature significantly enhanced capabilities, including support for context windows up to 65,000 tokens, roughly equivalent to a short book chapter. [1]
This extended context length makes the models particularly suitable for analyzing longer documents and supporting more complex reasoning tasks that require maintaining context over extended conversations or document analysis. What sets Olmo 3 apart from many competitors is its commitment to complete transparency. Ai2 is releasing the full "model flow" behind Olmo 3, providing snapshots that show how the model progressed through each stage of training. [1]
The updated OlmoTrace tool allows researchers to link a model's reasoning steps directly back to the specific data and training decisions that influenced them. Noah Smith, Ai2's senior director of NLP research, emphasized the importance of this transparency for enterprise customers, particularly those in regulated industries. "There are a lot of people for whom data privacy, control over what goes into the model, how the models train and other constraints on how the model can be used are front of mind," Smith explained. [2]
The models are released under the Apache 2.0 license, giving organizations complete control over training data and checkpointing processes.
Ai2 claims significant efficiency improvements with Olmo 3, reporting that the base model is 2.5 times more efficient to train than Meta's Llama 3.1, measured by GPU-hours per token. [1]
This efficiency gain stems from training Olmo 3 on significantly fewer tokens than comparable systems, in some cases six times fewer than rival models. The model was pretrained on the six-trillion-token Dolma 3 dataset, which encompasses web data, scientific literature, and code. [2]
According to Ai2, Olmo 3 models outperformed other open models including Stanford's Marin, LLM360's K2, and Apertus across various benchmarks. The Olmo 3-Think 32B model particularly stands out as "the strongest fully open reasoning model," narrowing the performance gap with leading open-weight models like the Qwen 3-32B-Thinking series. [2]