Curated by THEOUTPOST
On Fri, 7 Feb, 4:04 PM UTC
5 Sources
[1]
Meta and UNESCO team up to improve translation AI
They're working to integrate speeches and text into AI speech and translation models. Meta has partnered with UNESCO on a new plan to improve translation and speech recognition AI, TechCrunch reported. As part of its Language Technology Partner Program, Meta is seeking collaborators willing to donate at least 10 hours of speech recordings with transcriptions, large written texts (200-plus sentences) and sets of translated sentences. The aim is to focus on "underserved languages, in support of UNESCO's work," Meta wrote in a blog post. So far, Meta and UNESCO have signed on the government of Nunavut, a northern Canadian territory. The aim is to develop translation systems for the Inuit languages used there, Inuktitut and Inuinnaqtun. "Our efforts are especially focused on underserved languages, in support of UNESCO's work as part of the International Decade of Indigenous Languages," Meta said. As part of the program, Meta is releasing an open source translation benchmark called BOUQuET -- a standard test to evaluate performance of AI models that do translation. It'll be composed of sentences "carefully crafted by linguistic experts," and Meta is seeking contributions on a dedicated site. Meta has taken a strong interest in AI translation for both text and speech, a logical move for a company that connects users around the world. Last year, it showed off a tool that uses AI to automatically dub Reels into other languages, complete with lip-sync, promising it would roll out first to some creators' videos in English and Spanish in the US. The company has gradually expanded its Meta AI assistant around the world and it's now available in 43 countries and over a dozen languages.
[2]
Meta launches new program to improve speech and translation AI | TechCrunch
Meta is launching a new program in partnership with UNESCO to collect speech recordings and transcriptions the company said will help the development of future openly available AI. The program, the Language Technology Partner Program, is seeking collaborators who can contribute more than 10 hours of speech recordings with transcriptions, large amounts of written text, and sets of translated sentences in "diverse languages." According to Meta, partners will work with the company's AI teams to integrate these languages into AI speech recognition and translation models, which -- when finalized -- will be open-sourced. Partners so far include the government of Nunavut, a sparsely populated territory in Northern Canada. Some residents of Nunavut speak Inuit languages collectively known as Inuktut. "Our efforts are especially focused on underserved languages, in support of UNESCO's work," Meta wrote in a blog post provided to TechCrunch. "Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background." Complementary to the new program, Meta said that it's releasing an open source machine translation benchmark to evaluate the performance of language translation models. The benchmark, composed of sentences crafted by linguists, supports seven languages, and can be accessed -- and contributed to -- from the AI development platform Hugging Face. Meta is framing both initiatives as philanthropic. But the company stands to benefit from upgraded speech recognition and translation models. Meta continues to expand the number of languages its AI-powered assistant, Meta AI, supports, and pilot features such as automatic translation for creators. Last September, Meta announced that it would begin testing a tool to translate voices in Instagram Reels, allowing creators to dub their speech and auto-lip-sync it.
Meta's treatment of content in languages other than English across its platforms has been the target of much criticism. According to one report, Facebook left almost 70% of Italian- and Spanish-language COVID misinformation unflagged compared to just 29% of similar English-language misinformation. And leaked documents from the company reveal that Arabic-language posts are regularly flagged erroneously as hate speech. Meta has said that it's taking steps to improve its translation and moderation technologies.
[3]
Meta Launches New Program to Improve AI Speech and Translation Models With UNESCO
Introduction of an Open-Source Translation Benchmark to evaluate AI translation models. Meta's Fundamental AI Research (FAIR) team has announced a new program aimed at enhancing and expanding machine translation and speech recognition, particularly for underserved languages. In collaboration with UNESCO, Meta is expanding its support for linguistic diversity through open-source AI models and research. To that end, Meta announced on Friday the launch of the new Language Technology Partner Programme, which aims to find partners to collaborate on advancing and expanding its open-source language technologies, including AI translation technologies. Meta is particularly focusing efforts on underserved languages, supporting UNESCO's work as part of the International Decade of Indigenous Languages. Partners will contribute speech recordings, transcriptions, and translated text to help improve AI-driven speech recognition and machine translation models. The Government of Nunavut, Canada, has already joined the initiative, providing data for the Inuit languages Inuktitut and Inuinnaqtun. Participants will also gain access to technical workshops led by Meta's researchers. "We are looking for partners who can contribute 10+ hours of speech recordings with transcriptions, large amounts of written text (200+ sentences) and sets of translated sentences in diverse languages," Meta said on February 7, 2025. The company added that partners will work with its teams to help integrate these languages into AI-driven speech recognition and machine translation models, which, when released, will be open source and freely available to the community. Additionally, Meta is launching an Open Source Translation Benchmark -- a standardised test that Meta says will help evaluate the performance of AI models that conduct translation.
Designed by linguistic experts, the benchmark assesses machine translation models. The benchmark is available in seven languages, and contributed translations will be made open source and available to others, Meta said. Meta said this new announcement is part of its long-term commitment to supporting under-served languages. In 2022, Meta released the No Language Left Behind (NLLB) project, an open-source machine translation engine that, according to the company, was the first neural machine translation model for many languages and laid the foundation for future research and development. More recently, Meta introduced the Meta Massively Multilingual Speech (MMS) project, which scaled speech recognition to over 1,100 languages. In 2024, the project added new capabilities, including zero-shot transcription, allowing AI to transcribe languages it has never encountered before without prior training. "Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background," Meta said in a blog post.
[4]
Meta Partners With UNESCO on Indigenous Language Translation
Meta is partnering with world heritage organization UNESCO in a move that could lead to lesser-known Indigenous languages being incorporated into Meta AI, TechCrunch reports. The Language Technology Partner Program is currently looking for contributors who can provide more than 10 hours of speech recordings with transcriptions. Contributors will also need to provide pre-translated sentences and a significant body of written work in the target language, which will then be used to build Meta's AI systems. The government of Nunavut, a territory in northern Canada whose residents speak a variety of endangered Native Inuit languages, has already signed up for the program as a partner, according to TechCrunch. Meta's contributions to the translation and transcription space might not be as universally well-known as Google's omnipresent Google Translate, but it's an area where the company is pouring a lot of attention at the moment. In January, the tech giant's AI research division published a scientific paper previewing a new AI model that could reportedly translate speech from 101 different languages, which it presented as a key step toward a widely available speech-to-speech translation model. Applications to join the Language Technology Partner Program are set to stay open until March 7, 2025, and the next steps will be discussed no later than April 15, 2025. "Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background," said a Meta spokesperson. "As we continue in this direction, we're excited to collaboratively enhance and expand machine translation and other language technologies." Plenty of other leading translation apps have been turning their resources towards endangered languages; Google added the Inuit language Inuktut in October 2024 and, in 2022, Quechua, a widely spoken Indigenous South American language used mainly in Peru.
Meta's investment in AI remains high across the board. Last month, CEO Mark Zuckerberg announced Meta planned to end 2025 with "more than 1.3 million GPUs," which would represent a doubling of its current GPU capacity, with the goal of powering edge AI assistants in the company's upcoming Llama 4 model. The upcoming model plans to compete with high-profile rivals such as OpenAI's ChatGPT and Google Gemini.
[5]
Advancing machine intelligence through human-centered research
To join our Language Technology Partner Program, please fill out this interest form. In addition to our Language Technology Partner Program, we're launching an open source machine translation benchmark of carefully crafted sentences by linguistics experts to showcase the diversity of human language. We invite you to access the benchmark in seven languages and contribute translations that will be open sourced for others. We aim to collectively build an unprecedented multilingual machine translation benchmark for the community. Our commitment to supporting more languages and developing open source technologies for them is ongoing. In 2022, we released No Language Left Behind (NLLB), a groundbreaking open source machine translation engine that laid the foundation for future research and development in this area. As the first neural machine translation model for many languages, NLLB paved the way for further innovation. Since its release, the open source community has built upon this work, expanding its capabilities to support dozens of additional languages. We're also pleased that UNESCO and Hugging Face collaborated with us to build a language translator based on NLLB, which we announced during the United Nations General Assembly week last September. As we continue to develop this technology, we're excited to collaborate with language communities to enhance and expand machine translation and other language technologies. To support digital empowerment, which is a key thematic area of the Global Action Plan of the International Decade of Indigenous Languages, we recently introduced the Massively Multilingual Speech (MMS) project, which scales audio transcription to over 1,100 languages. We've continued to improve and expand its capabilities, including the addition of zero-shot speech recognition, which enables the model to transcribe audio in languages it's never seen before without prior training.
These technologies have significant implications for language support and accessibility, particularly for under-served communities. By promoting the implementation of the International Decade of Indigenous Languages, we're aiming to address the challenges posed by the proliferation of English language models and working towards equal representation for all languages, contributing to the achievement of the United Nations' Sustainable Development Goals. In addition to its potential impact on language support and accessibility, our work also has broader implications for the development of AMI. By working on multilingual problems and under-served languages, the model demonstrates the ability to learn from minimal data. These developments mark a crucial step towards creating intelligent systems that can adapt to new situations and learn from experience. Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background, and build technology that is inclusive of our world's languages and cultures.
Meta collaborates with UNESCO to improve AI translation and speech recognition, focusing on underserved and indigenous languages. The initiative includes a new Language Technology Partner Program and an open-source translation benchmark.
Meta, in partnership with UNESCO, has launched a groundbreaking program aimed at improving AI-driven translation and speech recognition technologies, with a particular focus on underserved and indigenous languages [1][2]. This initiative, known as the Language Technology Partner Program, seeks to address the linguistic diversity challenges in AI development and support UNESCO's work during the International Decade of Indigenous Languages [3].
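The program is actively seeking collaborators who can contribute substantial language resources, including 10 or more hours of speech recordings with transcriptions, large amounts of written text (200 or more sentences), and sets of translated sentences.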
These contributions will be integrated into AI speech recognition and translation models, which will ultimately be open-sourced and made freely available to the community [3].
The government of Nunavut, a northern Canadian territory, has already joined the initiative as a partner. Their participation aims to develop translation systems for Inuit languages, specifically Inuktitut and Inuinnaqtun [1][4]. This collaboration underscores Meta's commitment to supporting underserved languages and aligns with UNESCO's efforts in preserving and promoting indigenous languages.
Complementing the Language Technology Partner Program, Meta is introducing an open-source translation benchmark called BOUQuET [1]. This standardized test, composed of sentences crafted by linguistic experts, will evaluate the performance of AI translation models [2]. Currently supporting seven languages, the benchmark is accessible on the AI development platform Hugging Face, encouraging community contributions and fostering collaborative improvement of translation technologies [2][5].
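For readers curious how a translation benchmark can score a model, here is a minimal Python sketch. It is not BOUQuET's actual scoring method, which the sources do not describe. It assumes a benchmark is a list of (source sentence, reference translation) pairs and uses a simple character n-gram overlap score as a stand-in metric. The function names and the toy sentence pair are hypothetical.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams in text after collapsing runs of whitespace."""
    cleaned = " ".join(text.split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def ngram_f1(hypothesis: str, reference: str, n: int = 3) -> float:
    """F1 overlap of character n-grams between a model output and a reference."""
    hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
    overlap = sum((hyp & ref).values())  # n-grams shared by both, with counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def score_system(translate, benchmark) -> float:
    """Average the per-sentence score over (source, reference) pairs."""
    scores = [ngram_f1(translate(src), ref) for src, ref in benchmark]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical one-sentence benchmark and a placeholder "model".
    toy_benchmark = [("Bonjour le monde", "Hello world")]
    print(score_system(lambda src: "Hello, world", toy_benchmark))
```

Real evaluations typically rely on established metrics and tooling rather than a hand-rolled score, but the structure is the same: a fixed set of expert-written references, a model's outputs, and one number that makes different models comparable.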
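This new initiative builds upon Meta's previous efforts in the field of language technology, including No Language Left Behind (NLLB), an open-source machine translation engine released in 2022, and the Massively Multilingual Speech (MMS) project, which scales speech recognition to over 1,100 languages and has since added zero-shot transcription.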
Meta's investment in multilingual AI has significant implications for language support, accessibility, and the development of adaptive AI systems [5]. By focusing on underserved languages, the company aims to create more inclusive technology that can understand and respond to complex human needs across diverse linguistic and cultural backgrounds [2][5].
The Language Technology Partner Program is accepting applications until March 7, 2025, with next steps to be discussed by April 15, 2025 [4]. As Meta continues to expand its AI capabilities, including plans to significantly increase its GPU capacity by the end of 2025, these language initiatives are poised to play a crucial role in the company's broader AI strategy [4].
While Meta frames these initiatives as philanthropic, it's worth noting that the company stands to benefit from improved speech recognition and translation models across its platforms [2]. Additionally, Meta has faced criticism for its handling of content in non-English languages, with reports highlighting discrepancies in content moderation and misinformation flagging across different languages [2].
As Meta advances its language technology efforts, the tech giant will need to address these concerns and demonstrate a commitment to equitable language support across its diverse user base.
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved