Curated by THEOUTPOST
On Fri, 30 Aug, 4:03 PM UTC
2 Sources
[1]
Google DeepMind's Morni AI to cover 125 Indic languages
Google DeepMind's India unit is working on an Indic language AI project called Morni (Multimodal Representation for India) which aims to cover 125 Indian languages and dialects to build inclusive and equitable Indic AI. "So India has 22 scheduled languages, which are viewed as official languages. But in our work, we are targeting over 100 Indian languages, because we find that there are 60 Indian languages which have over a million speakers and over 125 languages that have over a lakh speakers each," said Manish Gupta, Director of Google DeepMind, Google India. He was speaking at the Global Fintech Fest in Mumbai on Thursday. He explained that 73 out of these 125 languages had zero corpus of digital data available. Even for a language like Hindi, which is now spoken by close to 10% of the world's population, the share of Hindi text on the internet is 0.1%. Google's research lab overcame the challenge of sourcing data for these languages by launching the project Vaani, a collaboration between Google, the Indian Institute of Science (IISc), and ARTPARK (Artificial Intelligence & Robotics Technology Park). The project has completed its first phase to create an open-source database of over 14,000 hours of speech data across 58 languages, collected from 80,000 speakers in 80 districts, Gupta said. First announced in December 2022, Project Vaani aims to collect and transcribe 154,000 hours of open-source anonymised speech data from all 773 districts of India. Gupta said they are now in the middle of phase two that will cover all states in India spanning 160 districts.
[2]
Google DeepMind's Morni AI to cover 125 Indic languages
Speaking at the Global Fintech Fest, Manish Gupta, Director of Google DeepMind, Google India, said 73 out of these 125 languages had zero corpus of digital data available.
Google DeepMind has unveiled Morni (Multimodal Representation for India), an Indic language AI project that aims to cover 125 Indian languages and dialects. The project is meant to widen digital inclusion and support India's linguistic diversity.
Google DeepMind's India unit is building Morni, an Indic language AI project that aims to cover 125 Indian languages and dialects [1]. Manish Gupta, Director of Google DeepMind, Google India, described the effort at the Global Fintech Fest in Mumbai, noting that India has 22 scheduled languages but more than 125 languages with over a lakh (100,000) speakers each [1].
Morni is meant to narrow the digital divide by supporting languages that are underrepresented online. According to Gupta, 73 of the 125 target languages have no digital corpus at all, and even Hindi, spoken by close to 10% of the world's population, accounts for only 0.1% of text on the internet [2]. Covering these languages would make digital content and services more accessible to the millions of Indians who communicate primarily in regional languages [2].
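To source data for languages with little digital presence, Google launched Project Vaani with the Indian Institute of Science (IISc) and ARTPARK. Its first phase produced an open-source dataset of over 14,000 hours of speech across 58 languages, collected from 80,000 speakers in 80 districts [1]. First announced in December 2022, Project Vaani aims to collect and transcribe 154,000 hours of anonymised speech from all 773 districts of India, and its second phase, now under way, spans 160 districts [1]. The goal is a model that can understand and generate content in many Indic languages, including those with limited digital resources [1].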
Morni could also help preserve and promote India's linguistic heritage. By supporting many regional languages, including less commonly spoken ones, it could help maintain linguistic diversity as more of daily life moves online [2].
The project could also spur more localized digital services, applications, and content, opening opportunities for startups and developers in India's tech sector [1].
Morni still faces challenges, including ensuring accuracy across all 125 languages and addressing potential biases in language processing. How well the project handles them will determine how much it contributes to digital inclusion and linguistic preservation in India [2].
© 2025 TheOutpost.AI All rights reserved