Curated by THEOUTPOST
On Fri, 25 Oct, 8:02 AM UTC
3 Sources
[1]
Cohere's multilingual models, Denmark's supercomputer, and legal AI agents: This week in new AI launches
Cohere For AI, the research arm of enterprise AI platform Cohere, announced its family of multilingual models, Aya Expanse, this week. The "highly performant" model family covers 23 languages and was released in 8-billion-parameter and 32-billion-parameter versions on Kaggle and Hugging Face. Parameters are the variables a model learns from training data that guide its ability to make predictions. The smaller model "makes breakthroughs more accessible to researchers worldwide," while the larger model "offers state-of-the-art multilingual capabilities," the company said. The multilingual models outperformed open-weight models from Google (GOOGL), Mistral, and Meta (META), according to Cohere.
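For readers who want to try the released weights, here is a minimal sketch of loading the 8B model with the Hugging Face transformers library. The repository name CohereForAI/aya-expanse-8b is an assumption based on the announcement; check the model card for the exact identifier and hardware requirements.

```python
# Minimal sketch: load Aya Expanse 8B from Hugging Face and generate a reply.
# NOTE: the repo ID below is an assumption; verify it on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/aya-expanse-8b"  # assumed Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Prompt in one of the 23 covered languages (here, French).
messages = [{"role": "user", "content": "Explique la photosynthèse en une phrase."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```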
[2]
Cohere launches new AI models to bridge global language divide
Cohere today released two new open-weight models in its Aya project to close the language gap in foundation models. Aya Expanse 8B and 32B, now available on Hugging Face, extend performance advancements across 23 languages. Cohere said in a blog post that the 8B-parameter model "makes breakthroughs more accessible to researchers worldwide," while the 32B-parameter model provides state-of-the-art multilingual capabilities.

The Aya project seeks to expand access to foundation models in global languages beyond English. Cohere for AI, the company's research arm, launched the Aya initiative last year. In February, it released the Aya 101 large language model (LLM), a 13-billion-parameter model covering 101 languages. Cohere for AI also released the Aya dataset to help expand access to other languages for model training.

Aya Expanse uses much of the same recipe used to build Aya 101. "The improvements in Aya Expanse are the result of a sustained focus on expanding how AI serves languages around the world by rethinking the core building blocks of machine learning breakthroughs," Cohere said. "Our research agenda for the last few years has included a dedicated focus on bridging the language gap, with several breakthroughs that were critical to the current recipe: data arbitrage, preference training for general performance and safety, and finally model merging."

Aya performs well

Cohere said the two Aya Expanse models consistently outperformed similar-sized AI models from Google, Mistral and Meta. Aya Expanse 32B did better in multilingual benchmark tests than Gemma 2 27B, Mistral 8x22B and even the much larger Llama 3.1 70B. The smaller 8B also performed better than Gemma 2 9B, Llama 3.1 8B and Ministral 8B.

Cohere developed the Aya models using a data sampling method called data arbitrage to avoid the gibberish generation that can happen when models rely on synthetic data. Many models are trained on synthetic data created by a "teacher" model. However, good teacher models are hard to find for other languages, especially low-resource ones, so Cohere instead drew on a pool of specialized teachers.

Cohere also focused on guiding the models toward "global preferences" and accounting for different cultural and linguistic perspectives, and said it found a way to improve performance and safety even while steering the models' preferences. "We think of it as the 'final sparkle' in training an AI model," the company said. "However, preference training and safety measures often overfit to harms prevalent in Western-centric datasets. Problematically, these safety protocols frequently fail to extend to multilingual settings. Our work is one of the first that extends preference training to a massively multilingual setting, accounting for different cultural and linguistic perspectives."

Models in different languages

The Aya initiative focuses on research into LLMs that perform well in languages other than English. Many LLMs eventually become available in other languages, especially widely spoken ones, but it is hard to find data to train models in those languages. English, after all, tends to be the official language of governments, finance, internet conversations and business, so it's far easier to find data in English.
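The article doesn't publish Cohere's exact arbitrage algorithm, but the core idea it describes, sampling each training example from whichever teacher in a pool produces the best output for a given prompt and language, can be sketched roughly as follows. The generate and score callables here are hypothetical stand-ins for real model inference and a quality scorer.

```python
# Sketch of the data-arbitrage idea: generate a completion from every teacher
# in a pool, then keep only the highest-scoring one per prompt. Filtering by
# score rather than trusting a single teacher is what guards against the
# "gibberish" failure mode in low-resource languages.
from typing import Callable, Dict, List

def arbitrage_sample(
    prompt: str,
    language: str,
    teachers: Dict[str, Callable[[str], str]],   # teacher name -> generate fn
    score_fn: Callable[[str, str, str], float],  # (prompt, completion, lang) -> quality
) -> Dict[str, str]:
    """Generate from every teacher and keep the best completion."""
    candidates = {name: gen(prompt) for name, gen in teachers.items()}
    best = max(candidates, key=lambda n: score_fn(prompt, candidates[n], language))
    return {"prompt": prompt, "completion": candidates[best], "teacher": best}

def build_synthetic_dataset(prompts: List[str], language: str, teachers, score_fn):
    """Assemble a synthetic training set, one arbitraged example per prompt."""
    return [arbitrage_sample(p, language, teachers, score_fn) for p in prompts]
```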
It can also be difficult to accurately benchmark model performance in different languages because of the quality of translations. Other developers have released their own language datasets to further research into non-English LLMs. OpenAI, for example, released its Multilingual Massive Multitask Language Understanding (MMMLU) dataset on Hugging Face last month. The dataset aims to help better test LLM performance across 14 languages, including Arabic, German, Swahili and Bengali.

Cohere has been busy these last few weeks. This week, the company added image search capabilities to Embed 3, its enterprise embedding product used in retrieval-augmented generation (RAG) systems. It also enhanced fine-tuning for its Command R 08-2024 model this month.
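As a rough illustration of how such a benchmark gets consumed, the following sketch loads one language split of the dataset with the Hugging Face datasets library. The dataset ID openai/MMMLU and the Swahili subset name SW_KE are assumptions to verify against the dataset card.

```python
# Sketch: load one language subset of OpenAI's multilingual MMLU test set.
# NOTE: the dataset ID and subset name are assumptions; check the dataset card.
from datasets import load_dataset

mmmlu_sw = load_dataset("openai/MMMLU", "SW_KE", split="test")
print(mmmlu_sw[0])  # one multiple-choice question with options A-D and an answer key
```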
[3]
Cohere announces Aya Expanse multilingual AI model family for researchers - SiliconANGLE
Cohere for AI, the nonprofit research lab run by artificial intelligence startup Cohere Inc., pushed the boundaries of multilingual frontier AI model research today with the release of Aya Expanse, a family of high-performance multilingual large language models that it says outperform other leading open rivals.

The new family includes two models, at 8 billion and 32 billion parameters, released with open weights on hosting sites Kaggle and Hugging Face. The models cover 23 languages, including English, Arabic, Chinese, Czech, Dutch, French, German, Greek and Hindi.

"Aya Expanse marks an important step to expand high-quality coverage of languages in LLMs," said the Cohere research team. "Since we first launched the Aya initiative two years ago, we have collaborated with over 3,000 researchers from 119 countries to expand cutting-edge multilingual research."

The Aya Initiative is an effort by Cohere to advance state-of-the-art multilingual AI, bridge the gap between people across the world using technology, and expand the number of languages covered by AI. It includes the Aya collection, the largest multilingual dataset collection to date with 513 million examples, and Aya-101, an AI model covering more than 100 languages.

The team said it used several new core research innovations in Aya Expanse that gave it superior performance: the use of synthetic data, human feedback in late-stage training, and model merging.

To train Aya Expanse, the lab turned to synthetic data for languages with limited datasets. Using data generated by "teacher" models for training is not an uncommon practice in the AI industry. However, large language models can suffer from model collapse, or produce "gibberish," when trained on synthetic data. To avoid this, the company used data arbitrage, drawing on teacher models specialized in particular languages.

Near the late stage of model training, the company said, it began using human feedback to guide the model toward high-quality outputs. Many multilingual models tend to be biased toward Western cultures and settings, mostly thanks to the countries of origin of their datasets and the companies that build them. "Our work is one of the first that extends preference training to a massively multilingual setting, accounting for different cultural and linguistic perspectives," the company said. "We find this leads to large gains both in general performance and safety."

Finally, to increase performance, Cohere merged the model weights of multiple fine-tuned candidates at each training stage into a single model. According to a study the company cites, merging can bring improvements of up to 8% in general performance and 10% in safety.

The company said these innovations brought Aya Expanse 8B to a 60.4% simulated win rate against Google LLC's Gemma 2 9B LLM on the multilingual m-ArenaHard benchmark. The larger model, Aya Expanse 32B, outperforms Gemma 2 27B and Mistral 8x22B with win rates of 51.8% and 76.6%, respectively. It also outperformed Meta Platforms Inc.'s Llama 3.1 70B, a model more than twice its size, with a pairwise win rate of 54%.
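The article doesn't specify Cohere's exact merging scheme, but the simplest common variant, a weighted average of the candidates' parameters (sometimes called a "model soup"), looks roughly like the sketch below. It assumes every candidate was fine-tuned from the same base architecture.

```python
# Sketch: merge fine-tuned candidate checkpoints by weighted parameter averaging.
# This is a generic "model soup"; Cohere's exact merging recipe isn't published here.
from typing import Dict, List, Optional
import torch

def merge_checkpoints(
    state_dicts: List[Dict[str, torch.Tensor]],
    weights: Optional[List[float]] = None,
) -> Dict[str, torch.Tensor]:
    """Average the parameter tensors of several same-architecture checkpoints."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)  # uniform average
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage: merged = merge_checkpoints([m1.state_dict(), m2.state_dict()]),
# then load the result into a fresh model with model.load_state_dict(merged).
```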
In addition to releasing the open weights for Aya Expanse 8B and 32B, Cohere said the company is continuing to collaborate on wider multilingual AI research to broaden access to linguistic data, software and compute resources.
Cohere's research arm releases Aya Expanse, a family of multilingual AI models that outperform leading open-weight alternatives, aiming to bridge the global language divide in AI technology.
Cohere For AI, the research division of enterprise AI platform Cohere, has unveiled its latest innovation in multilingual artificial intelligence: the Aya Expanse family of models. This release marks a significant advancement in bridging the global language divide within AI technology [1].
The Aya Expanse family includes two open-weight models:
- Aya Expanse 8B, an 8-billion-parameter model that "makes breakthroughs more accessible to researchers worldwide"
- Aya Expanse 32B, a 32-billion-parameter model offering state-of-the-art multilingual capabilities
Both models are now accessible on popular AI platforms Kaggle and Hugging Face, catering to researchers and developers worldwide [2].
Aya Expanse boasts impressive multilingual capabilities:
- Coverage of 23 languages, including English, Arabic, Chinese, Czech, Dutch, French, German, Greek and Hindi
- Benchmark wins over similar-sized open-weight models: Aya Expanse 8B beat Gemma 2 9B, Llama 3.1 8B and Ministral 8B, while Aya Expanse 32B beat Gemma 2 27B, Mistral 8x22B and the much larger Llama 3.1 70B
Cohere employed several cutting-edge techniques in developing Aya Expanse:
- Data arbitrage: sampling synthetic training data from a pool of specialized teacher models to avoid gibberish in low-resource languages
- Multilingual preference training: steering models toward "global preferences" that account for different cultural and linguistic perspectives
- Model merging: combining the weights of multiple fine-tuned candidate models at each training stage
The Aya initiative, launched two years ago, aims to advance multilingual AI research and bridge language gaps. Key achievements include:
- Aya 101, a 13-billion-parameter LLM covering 101 languages
- The Aya collection, the largest multilingual dataset collection to date, with 513 million examples
- Collaboration with over 3,000 researchers from 119 countries
The release of Aya Expanse represents a significant step towards more inclusive AI technology. By addressing the challenges of data scarcity in non-English languages and incorporating diverse cultural perspectives, Cohere is paving the way for more equitable AI development.
As the AI industry continues to grapple with language barriers, initiatives like Aya Expanse could play a crucial role in democratizing access to advanced AI capabilities across linguistic and cultural boundaries [1][2][3].