Curated by THEOUTPOST
On Wed, 27 Nov, 4:02 PM UTC
3 Sources
[1]
OLMo 2: Ai2's second Open Language Model
This is an open-source AI according to the Open Source Initiative's definition, meaning that the tools and data used to develop it are available to the public. Ai2 claims a considerable improvement in performance over its own previous OLMo 0424 model, and says that OLMo 2 7B outperforms Meta's Llama-3.

Ai2, the non-profit AI research organisation founded by the late Microsoft co-founder Paul Allen, released the second family of models in its OLMo (Open Language Model) series, named OLMo 2, on Tuesday. OLMo 2 meets the Open Source Initiative's definition of open-source AI, which means that the tools and data used to develop it are publicly available. The Open Source Initiative is an organisation that aims to define and "steward" all things open source, according to TechCrunch.

Ai2's website claims that since the release of the first OLMo in February 2024, it has seen "rapid growth in the open language model ecosystem, and a narrowing of the performance gap between open and proprietary models."

According to Ai2's blog, "OLMo 2 is a family of fully-open language models, developed start-to-finish with open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, intermediate checkpoints, and more. By openly sharing our data, recipes, and findings, we hope to provide the open-source community with the resources needed to discover new and innovative approaches to improve model pretraining."

The OLMo 2 family has two models: one with 7 billion parameters (OLMo 7B) and another with 13 billion parameters (OLMo 13B). According to TechCrunch, parameters can roughly be described as a model's problem-solving skills, and models with more parameters generally perform better than those with fewer. Like other language models, the OLMo 2 models can execute text-based tasks, like answering questions, summarizing documents, and writing code.

Ai2 used a data set of 5 trillion tokens to train the models. TechCrunch says that tokens represent bits of raw data; 1 million tokens is equal to about 750,000 words. The training set included websites "filtered for high quality," academic papers, Q&A discussion boards, and math workbooks "both synthetic and human generated."

In its blog, Ai2 said, "Not only do we observe a dramatic improvement in performance across all tasks compared to our earlier OLMo 0424 model but, notably, OLMo 2 7B outperforms (Meta's) Llama-3".

The OLMo 2 models and all of their components can be downloaded from Ai2's website. They can be used commercially, since they are released under the Apache 2.0 license.
[2]
Ai2 releases new language models competitive with Meta's Llama | TechCrunch
There's a new AI model family on the block, and it's one of the few that can be reproduced from scratch.

On Tuesday, Ai2, the nonprofit AI research organization founded by the late Paul Allen, released OLMo 2, the second family of models in its OLMo series. (OLMo is short for "Open Language Model.") While there's no shortage of "open" language models to choose from (see: Meta's Llama), OLMo 2 meets the Open Source Initiative's definition of open source AI, meaning the tools and data used to develop it are publicly available.

The Open Source Initiative, the long-running institution aiming to define and "steward" all things open source, finalized its open source AI definition in October. But the first OLMo models, released in February, met the criteria as well.

"OLMo 2 [was] developed start-to-finish with open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, intermediate checkpoints, and more," Ai2 wrote in a blog post. "By openly sharing our data, recipes, and findings, we hope to provide the open-source community with the resources needed to discover new and innovative approaches."

There are two models in the OLMo 2 family: one with 7 billion parameters (OLMo 7B) and one with 13 billion parameters (OLMo 13B). Parameters roughly correspond to a model's problem-solving skills, and models with more parameters generally perform better than those with fewer.

Like most language models, OLMo 2 7B and 13B can perform a range of text-based tasks, like answering questions, summarizing documents, and writing code. To train the models, Ai2 used a data set of 5 trillion tokens. Tokens represent bits of raw data; 1 million tokens is equal to about 750,000 words. The training set included websites "filtered for high quality," academic papers, Q&A discussion boards, and math workbooks "both synthetic and human generated."

Ai2 claims the result is models that are competitive, performance-wise, with open models like Meta's Llama 3.1 release. "Not only do we observe a dramatic improvement in performance across all tasks compared to our earlier OLMo model but, notably, OLMo 2 7B outperforms Llama 3.1 8B," Ai2 writes. "OLMo 2 [represents] the best fully-open language models to date."

The OLMo 2 models and all of their components can be downloaded from Ai2's website. They're under the Apache 2.0 license, meaning they can be used commercially.

There's been some debate recently over the safety of open models, what with Llama models reportedly being used by Chinese researchers to develop defense tools. When I asked Ai2 engineer Dirk Groeneveld in February whether he was concerned about OLMo being abused, he told me that he believes the benefits ultimately outweigh the harms.

"Yes, it's possible open models may be used inappropriately or for unintended purposes," he said. "[However, this] approach also promotes technical advancements that lead to more ethical models; is a prerequisite for verification and reproducibility, as these can only be achieved with access to the full stack; and reduces a growing concentration of power, creating more equitable access."
[3]
Ai2's New Language Models Compete Directly with Qwen and Llama
The Allen Institute for AI (Ai2), founded by the late Microsoft co-founder Paul Allen, has released a new family of language models, OLMo 2, to improve natural language understanding and generation. These models were recently updated on Hugging Face.

The models, available in both 7 billion (OLMo-2-1124-7B) and 13 billion (OLMo-2-1124-13B) parameter versions, are trained on a wide range of datasets to help improve the performance of AI systems in various applications, from scientific research to customer service. Ai2 announced the development on X, saying, "Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens."

Ai2 has also released instruction-tuned versions of the OLMo 2 models. These 'instruct' models utilise the Tülu 3 dataset to perform better on tasks that require understanding and following specific instructions. OLMo-2-1124-7B-Instruct and OLMo-2-1124-13B-Instruct are optimised to handle more structured and goal-oriented tasks. These models have shown significant improvements on benchmarks such as MATH, GSM8K, and IFEval, displaying the ability to tackle complex reasoning tasks and respond more effectively to inputs.

One of the key features of the OLMo 2 release is its emphasis on open access. Ai2 has made the models and training data publicly available, including all code and intermediate checkpoints, under the Apache 2.0 license. This openness is in line with Ai2's commitment to fostering transparency and reproducibility in AI research. By providing access to the full model architecture, training processes, and evaluation results, Ai2 says it enables other researchers to build upon its work and contribute to further advancements in the field of language modelling.

The OLMo 2 models were trained using up to 5 trillion tokens, a vast amount of text data that has enabled them to achieve high performance in multiple natural language processing tasks. These models compete directly with leading frontier models such as Qwen 2.5, Llama 3.1, Mistral NeMo, and Gemma 2.

In terms of performance, OLMo 2 has achieved notable results across various benchmarks. For instance, the OLMo 2 7B Instruct model has demonstrated strong performance on tasks like GSM8K and MATH, indicating its proficiency in mathematical reasoning. Similarly, the OLMo 2 13B Instruct model has shown competitive results on benchmarks like GSM8K and IFEval, demonstrating its ability to handle diverse tasks.

As reported by AIM in the past, Ai2's Macaw competed directly with OpenAI's GPT-3. Most of the performance metrics and benchmarks now suggest that OLMo 2 is on par with these frontier models. Its open-weight availability and comprehensive training approach position it as a strong contender in the landscape of advanced language models.
The Allen Institute for AI (Ai2) has unveiled OLMo 2, a family of open-source language models that compete with leading AI models while adhering to open-source principles, potentially reshaping the landscape of accessible AI technology.
The Allen Institute for AI (Ai2), a non-profit research organization founded by the late Microsoft co-founder Paul Allen, has released OLMo 2, the latest iteration of its Open Language Model series. This release marks a significant advancement in open-source AI technology, potentially rivaling proprietary models in performance while maintaining full transparency and accessibility [1].
OLMo 2 comes in two variants: a 7 billion parameter model (OLMo 7B) and a 13 billion parameter model (OLMo 13B). These models were trained on an extensive dataset of 5 trillion tokens, encompassing high-quality websites, academic papers, Q&A discussion boards, and both synthetic and human-generated math workbooks [2].
Ai2 claims that OLMo 2 demonstrates substantial improvements over its predecessor, OLMo 0424, and that OLMo 2 7B even outperforms Meta's Llama 3.1 8B [2]. The models are capable of executing various text-based tasks, including answering questions, summarizing documents, and writing code [1].
What sets OLMo 2 apart is its adherence to the Open Source Initiative's definition of open-source AI. This means that all tools, data, and processes used in its development are publicly available. Ai2 has released not only the models but also the training data, code, intermediate checkpoints, and evaluation results under the Apache 2.0 license, allowing for commercial use [2].
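Because the weights are openly published, the base models can be loaded with standard open-source tooling. The following is a minimal sketch using the Hugging Face transformers library; the repository ID is an assumption built from the model names Ai2 announced (OLMo-2-1124-7B), so check Ai2's Hugging Face page for the exact identifiers.

```python
# Minimal sketch: loading the OLMo 2 7B base model with Hugging Face
# transformers. The repo ID below is an assumption based on the model
# names Ai2 announced (OLMo-2-1124-7B); verify it before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Simple completion: tokenize a prompt, generate, and decode.
inputs = tokenizer("Fully open language models matter because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```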
Ai2 has also introduced instruction-tuned versions of OLMo 2, optimized for structured and goal-oriented tasks. These variants, OLMo-2-1124-7B-Instruct and OLMo-2-1124-13B-Instruct, have shown impressive results on benchmarks such as MATH, GSM8K, and IFEval, demonstrating proficiency in complex reasoning tasks [3]. A sketch of how such a variant would be prompted follows.
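This hedged example assumes the "-Instruct" checkpoints ship with a chat template in their tokenizer configuration, as most instruction-tuned models on Hugging Face do; the repository ID is again an assumption built from the names above.

```python
# Sketch: querying an instruction-tuned OLMo 2 checkpoint via its chat
# template. Assumes the tokenizer config defines a chat template and
# that the repo ID below matches Ai2's published naming.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B-Instruct"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format a single-turn conversation the way the model was tuned to see it.
messages = [{"role": "user", "content": "What is 12 * 17? Show your steps."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```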
OLMo 2 enters a competitive field, challenging established models like Qwen 2.5, Llama 3.1, Mistral NeMo, and Gemma 2. Its performance metrics suggest that it is on par with these frontier models, while its open-weight availability provides a unique advantage [3].
The release of OLMo 2 reignites discussions about the safety of open-source AI models. While concerns exist about potential misuse, Ai2 engineer Dirk Groeneveld argues that the benefits of open models outweigh the risks. He emphasizes that this approach promotes technical advancements, enables verification and reproducibility, and creates more equitable access to AI technology [2].