Curated by THEOUTPOST
On Sat, 26 Oct, 4:01 PM UTC
10 Sources
[1]
Meta using over 100,000 NVIDIA H100 AI GPUs for Llama 4, Zuck: 'bigger than anything I've seen'
TLDR: Mark Zuckerberg announced that Meta is training its Llama 4 model, expected to launch early next year, on a massive AI GPU cluster with over 100,000 NVIDIA H100 GPUs. The setup reportedly cost over $2 billion in chips alone. Based on reporting by Anthony Garreffa.

Mark Zuckerberg has provided a small update on Meta's work on its new Llama 4 model, which is being trained on a cluster of AI GPUs "bigger than anything" Zuck has seen. Meta is cooking its new Llama 4 right now, with Zuckerberg telling investors and analysts on an earnings call this week that the initial launch of Llama 4 is expected early next year. Zuck said: "We're training the Llama 4 models on a cluster that is bigger than 100,000 H100s, or bigger than anything that I've seen reported for what others are doing. I expect that the smaller Llama 4 models will be ready first".

Meta's new AI supercomputer with its 100,000+ NVIDIA H100 AI GPUs reportedly cost over $2 billion for the H100 AI GPU chips alone (that works out to at least $20,000 per GPU), which means Mark Zuckerberg is signing some fat cheques to NVIDIA. Speaking of which, NVIDIA CEO Jensen Huang recently spoke with Zuckerberg, noting that Meta now has 600,000+ NVIDIA H100 AI GPUs, to which Zuck quipped that Meta are "good customers for NVIDIA".

Meta is in a race with Elon Musk's xAI, which is doubling its Colossus AI supercomputer cluster from 100,000 to 200,000 NVIDIA Hopper AI GPUs. The fight for the biggest AI supercluster is damn interesting to watch; just don't show me the power bills.
[2]
Meta is using more than 100,000 Nvidia H100 AI GPUs to train Llama-4 -- Mark Zuckerberg says that Llama 4 is being trained on a cluster "bigger than anything that I've seen"
Llama 4 is slated to have new modalities, stronger reasoning, and faster performance.

Mark Zuckerberg said on a Meta earnings call earlier this week that the company is training Llama 4 models "on a cluster that is bigger than 100,000 H100 AI GPUs, or bigger than anything that I've seen reported for what others are doing." While the Facebook founder didn't give any details on what Llama 4 could do, Wired quoted Zuckerberg describing Llama 4 as having "new modalities" and "stronger reasoning," and being "much faster." This is a crucial development as Meta competes against other tech giants like Microsoft, Google, and Musk's xAI to develop the next generation of AI LLMs.

Meta isn't the first company to have an AI training cluster with 100,000 Nvidia H100 GPUs. Elon Musk fired up a similarly sized cluster in late July, calling it a "Gigafactory of Compute," with plans to double its size to 200,000 AI GPUs. However, Meta stated earlier this year that it expects to have over half a million H100-equivalent AI GPUs by the end of 2024, so it likely already has a significant number of AI GPUs running for training Llama 4.

Meta is taking a unique approach with Llama 4, releasing its Llama models entirely for free and allowing other researchers, companies, and organizations to build upon them. This differs from models like OpenAI's GPT-4o and Google's Gemini, which are only accessible via an API. The company still places limitations on Llama's license, such as restricting some commercial use, and offers no information on how the models were trained. Nevertheless, Llama's "open source" nature could help it dominate the future of AI -- we've seen this with Chinese AI models built off open-source code that could match GPT-4o and Llama 3 in benchmark tests.

All this computing power results in massive power demand, especially as a single modern AI GPU can use up to 3.7 MWh of power annually. That means a 100,000-GPU cluster would use at least 370 GWh annually -- enough to power over 34,000 average American households. This raises concerns about how these companies can secure such massive power supplies, especially as bringing new power sources online takes time. After all, even Zuckerberg himself has said that power constraints will limit AI growth. Elon Musk, for example, used several large mobile power generators to power his 100,000-GPU cluster in Memphis. Google has been slipping behind its carbon targets, increasing its greenhouse gas emissions by 48% since 2019. The former Google CEO even suggested we should drop our climate goals, let AI companies go full tilt, and then use the AI technologies we've developed to solve the climate crisis. Meta executives, however, dodged the question when an analyst asked how the company would power such a massive computing cluster.

Meanwhile, Meta's AI competitors, like Microsoft, Google, Oracle, and Amazon, are jumping on the nuclear bandwagon, either investing in small modular reactors or restarting old nuclear plants to ensure they have enough electricity for their future developments. While these will take time to develop and deploy, giving AI data centers their own small nuclear plants would help reduce the burden of these power-hungry clusters on the national grid.
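The household-equivalence figure above is simple arithmetic, and a short script makes the assumptions explicit. The 3.7 MWh-per-GPU figure comes from the article; the roughly 10.8 MWh annual consumption of an average American household is our assumption, in line with published US averages.

```python
# Back-of-the-envelope check of the cluster power figures quoted above.
GPU_ANNUAL_MWH = 3.7          # per-GPU annual consumption cited in the article
NUM_GPUS = 100_000
HOUSEHOLD_ANNUAL_MWH = 10.8   # assumed average US household usage (EIA-style figure)

cluster_gwh = GPU_ANNUAL_MWH * NUM_GPUS / 1_000                  # MWh -> GWh
households = GPU_ANNUAL_MWH * NUM_GPUS / HOUSEHOLD_ANNUAL_MWH

print(f"Cluster demand: {cluster_gwh:.0f} GWh/year")             # ~370 GWh
print(f"Equivalent households: {households:,.0f}")               # ~34,000
```

The result, roughly 34,000 households, is why the figure in the article reads "34,000" rather than a larger number: 370 GWh is a lot of energy, but a single average household only draws around 10-11 MWh per year.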
[3]
Meta's Next Llama AI Models Are Training on a GPU Cluster 'Bigger Than Anything' Else
The race for better generative AI is also a race for more computing power. On that score, according to CEO Mark Zuckerberg, Meta appears to be winning.

Meta CEO Mark Zuckerberg laid down the newest marker in generative AI training on Wednesday, saying that the next major release of the company's Llama model is being trained on a cluster of GPUs that's "bigger than anything" else that's been reported. Llama 4 development is well under way, Zuckerberg told investors and analysts on an earnings call, with an initial launch expected early next year. "We're training the Llama 4 models on a cluster that is bigger than 100,000 H100s, or bigger than anything that I've seen reported for what others are doing," Zuckerberg said, referring to the Nvidia chips popular for training AI systems. "I expect that the smaller Llama 4 models will be ready first."

Increasing the scale of AI training with more computing power and data is widely believed to be key to developing significantly more capable AI models. While Meta appears to have the lead now, most of the big players in the field are likely working towards compute clusters with more than 100,000 advanced chips. In March, Meta and Nvidia shared details about clusters of about 25,000 H100s that were used to develop Llama 3. In July, Elon Musk touted his xAI venture having worked with X and Nvidia to set up 100,000 H100s. "It's the most powerful AI training cluster in the world!" he wrote on X at the time.

On Wednesday, Zuckerberg declined to offer details on Llama 4's potential advanced capabilities but vaguely referred to "new modalities," "stronger reasoning," and "much faster" performance.

Meta's approach to AI is proving a wild card in the corporate race for dominance. Llama models can be downloaded in their entirety for free, in contrast to the models developed by OpenAI, Google, and most other major companies, which can only be accessed through an API. Llama has proven hugely popular with startups and researchers looking to have complete control over their models, data, and compute costs. Although touted as "open source" by Meta, the Llama license does impose some restrictions on the model's commercial use, and Meta does not disclose details of the models' training, which limits outsiders' ability to probe how they work. The company released the first version of Llama in early 2023 and made the latest version, Llama 3.2, available this September.
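Because the weights themselves are downloadable, "complete control" is concrete: a developer can pull a checkpoint and run it locally with no API in the loop. Below is a minimal sketch using the Hugging Face transformers library; the model ID is an assumption (Meta's checkpoints are gated, so you must accept the license on Hugging Face first).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID; gated, requires accepting Meta's license on Hugging Face.
model_id = "meta-llama/Llama-3.2-1B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Once downloaded, inference is entirely local: no API, no per-token billing.
inputs = tokenizer("The key advantage of open-weight models is",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This is exactly the control the article describes: the same checkpoint can then be fine-tuned, quantised, or deployed on private infrastructure, which API-only models do not allow.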
[4]
Mark Zuckerberg Confirms Llama 4 Release Early Next Year
"We're training the Llama 4 models on a cluster that is bigger than 100k H100s or bigger than anything that I've seen reported," said Mark Zuckerberg on the earnings call. Meta's chief Mark Zuckerberg recently confirmed AIM's prediction of the company releasing Llama 4 early next year at its third quarterly earnings call. "I expect that the smaller Llama 4 models will be ready first, and we expect [them] sometime early next year, and I think that they're going to be a big deal on several fronts -- new modalities, capabilities, stronger reasoning, and much faster," said Zuckerberg. Meta announced its Q3 2024 results on October 30, where the company posted strong operational and financial results. Total revenue surged 19% to $40.59 with expenses climbing 14% to $23.24 billion. As per Zuckerberg, the Llama 3 models marked a major turning point in the industry, but he expressed even greater enthusiasm for Llama 4, which he confirmed is well into development. "We're training the Llama 4 models on a cluster that is bigger than 100k H100s or bigger than anything that I've seen reported for what others are doing," he said. At the recent Meta's Build with AI Summit, in Bengaluru, Meta's VP of product, Ragavan Srinivasan, hinted at releasing "next gen" Llama models by 2025, with native integrations, extended memory and context capabilities, cross-modality support, and expanded third-party collaborations, while advancing memory-based applications for coding and leveraging deep hardware partnerships. Zuckerberg also mentioned that the latest Llama 3.2 model, known for its small, on-device capabilities and multimodal functionality, saw extensive adoption, not only by enterprises but also by the public sector, particularly in the U.S. government. Meta's growth was partly driven by increased advertiser demand for Meta's AI-driven advertising solutions, which leverage generative AI tools that have shown promising results. Over one million advertisers utilised Meta's generative AI tools last month, creating more than 15 million ads. Businesses that implemented AI-driven image generation reported a remarkable 7% increase in conversions. "Improvements to our AI-driven feed and video recommendations have led to an 8% increase in time spent on Facebook and a 6% increase on Instagram this year alone," Zuckerberg noted, showcasing the tangible impact of AI on user engagement. Over one million advertisers leveraged Meta's generative AI tools for ad creation, which led to a 7% boost in conversion rates for businesses using image generation features. "We had a good quarter driven by AI progress across our apps and business," said Zuckerberg. "We also have strong momentum with Meta AI, Llama adoption, and AI-powered glasses."As Meta continues to refine its AI offerings and enhance its infrastructure, Zuckerberg emphasised the company's intent to explore new business opportunities driven by these advancements. "There are a lot of new opportunities to use new AI advances to accelerate our core business that should have strong ROI over the next few years," he stated.
[5]
Mark Zuckerberg Confirms Llama 4 Release in Early 2025
"We're training the Llama 4 models on a cluster that is bigger than 100k H100s or bigger than anything that I've seen reported," said Mark Zuckerberg on the earnings call. Meta's chief Mark Zuckerberg recently confirmed AIM's prediction of the company releasing Llama 4 early next year at its third quarterly earnings call. "I expect that the smaller Llama 4 models will be ready first, and we expect [them] sometime early next year, and I think that they're going to be a big deal on several fronts -- new modalities, capabilities, stronger reasoning, and much faster," said Zuckerberg. Meta announced its Q3 2024 results on October 30, where the company posted strong operational and financial results. Total revenue surged 19% to $40.59 with expenses climbing 14% to $23.24 billion. As per Zuckerberg, the Llama 3 models marked a major turning point in the industry, but he expressed even greater enthusiasm for Llama 4, which he confirmed is well into development. "We're training the Llama 4 models on a cluster that is bigger than 100k H100s or bigger than anything that I've seen reported for what others are doing," he said. At the recent Meta's Build with AI Summit, in Bengaluru, Meta's VP of product, Ragavan Srinivasan, hinted at releasing "next gen" Llama models by 2025, with native integrations, extended memory and context capabilities, cross-modality support, and expanded third-party collaborations, while advancing memory-based applications for coding and leveraging deep hardware partnerships. Zuckerberg also mentioned that the latest Llama 3.2 model, known for its small, on-device capabilities and multimodal functionality, saw extensive adoption, not only by enterprises but also by the public sector, particularly in the U.S. government. Meta's growth was partly driven by increased advertiser demand for Meta's AI-driven advertising solutions, which leverage generative AI tools that have shown promising results. Over one million advertisers utilised Meta's generative AI tools last month, creating more than 15 million ads. Businesses that implemented AI-driven image generation reported a remarkable 7% increase in conversions. "Improvements to our AI-driven feed and video recommendations have led to an 8% increase in time spent on Facebook and a 6% increase on Instagram this year alone," Zuckerberg noted, showcasing the tangible impact of AI on user engagement. Over one million advertisers leveraged Meta's generative AI tools for ad creation, which led to a 7% boost in conversion rates for businesses using image generation features. "We had a good quarter driven by AI progress across our apps and business," said Zuckerberg. "We also have strong momentum with Meta AI, Llama adoption, and AI-powered glasses."As Meta continues to refine its AI offerings and enhance its infrastructure, Zuckerberg emphasised the company's intent to explore new business opportunities driven by these advancements. "There are a lot of new opportunities to use new AI advances to accelerate our core business that should have strong ROI over the next few years," he stated.
[6]
Meta Likely to Release Llama 4 Early Next Year, Pushing Towards Autonomous Machine Intelligence (AMI)
AI models are getting better at reasoning -- of sorts. OpenAI's o1, for instance, has levelled up enough to earn a cautious nod from Apple. Meanwhile, Kai-Fu Lee's 01.AI is also making waves with Yi-Lightning, claiming to outpace GPT-4o on reasoning benchmarks. With China's models catching up fast, Meta is also stepping up Llama's game. The big question: can Meta bring Llama's reasoning closer to the likes of GPT-4o and o1?

Manohar Paluri, VP of AI at Meta, told AIM that the team is exploring ways for Llama models to not only "plan" but also evaluate decisions in real time and adjust when conditions change. This iterative approach, using techniques like chain of thought, supports Meta's vision of achieving "autonomous machine intelligence" that can effectively combine perception, reasoning, and planning.

Meta AI chief Yann LeCun believes that autonomous machine intelligence, or AMI (the acronym is also the French word for "friend"), can truly help people in their daily lives. This, according to him, involves developing systems that can understand cause and effect and model the physical world. It might also be an alternative term for AGI or ASI, which OpenAI is so obsessed with achieving, or, most likely, has already achieved internally by now. That could explain why Sam Altman recently debunked the rumours of Orion (GPT-5) being released this December, labelling them "fake news out of control", perhaps while waiting for Google, Meta and others to catch up.

AGI or AMI talks aside, Paluri further highlighted that reasoning in AI, particularly in "non-verifiable domains", requires breaking down complex tasks into manageable steps, which allows the model to adapt dynamically. For example, planning a trip involves not only booking a flight but also handling real-time constraints like weather changes, which may mean rerouting to alternative transportation. "The fundamental learning aspect here is the ability to know that I'm on the right track and to backtrack if needed. That's where future Llama versions will excel in complex, real-world problem solving," he added.

Recently, Meta unveiled Dualformer, a model that dynamically switches between fast, intuitive thinking and slow, deliberate reasoning, mirroring human cognitive processes and enabling efficient problem-solving across tasks like maze navigation and complex maths.

What's Llama's Secret Sauce?

Meta said that it leverages self-supervised learning (SSL) during training to help Llama learn broad representations of data across domains, which allows for flexibility in general knowledge. RLHF (reinforcement learning from human feedback), which currently powers GPT-4o and the majority of other models today, focuses instead on refining behaviour for specific tasks, ensuring that the model not only understands data but aligns with practical applications. Meta is combining the two to build models that are both versatile and task-oriented. Paluri said that SSL builds a foundational understanding from raw data, while RLHF aligns the model with human-defined goals by providing specific feedback after tasks. "Self-supervised learning enables models to pick up general knowledge from vast data autonomously. In contrast, RLHF is about task-specific alignment; it's like telling the model 'good job' or 'try again' as it learns to perform specific actions."
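The SSL-versus-RLHF distinction Paluri draws is easy to see in code. Below is a toy sketch in PyTorch, not Meta's training stack: the self-supervised objective is plain next-token cross-entropy on raw text, while the RLHF-style signal scales the log-probability of a generated sequence by a scalar reward (a simplified REINFORCE update; production RLHF typically uses PPO with a KL penalty against a reference model).

```python
import torch
import torch.nn.functional as F

# Toy stand-in for a language model: embedding + linear head over a tiny vocab.
vocab, dim = 16, 8
emb = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

tokens = torch.randint(0, vocab, (1, 10))     # a "sentence" of 10 token IDs
logits = head(emb(tokens[:, :-1]))            # predict token t+1 from token t

# SSL objective: next-token cross-entropy on raw, unlabeled text.
ssl_loss = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))

# RLHF-style objective (simplified): a human or reward model scores the output,
# and the reward scales the sequence log-probability ("good job" / "try again").
log_probs = F.log_softmax(logits, dim=-1)
seq_logp = log_probs.gather(-1, tokens[:, 1:].unsqueeze(-1)).sum()
reward = 1.0                                  # pretend feedback signal
rlhf_loss = -reward * seq_logp

print(f"ssl_loss={ssl_loss.item():.3f}  rlhf_loss={rlhf_loss.item():.3f}")
```

The first loss needs nothing but text, which is why SSL scales to internet-sized corpora; the second needs a judgment about the output, which is why RLHF is reserved for task-specific alignment.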
Enables High-Quality Synthetic Data Generation

That explains how Llama has become a preferred choice for synthetic data generation. Llama 3.1 405B, in particular, is trained to generate data that captures language-specific nuances, improving model effectiveness in regions where data scarcity is a barrier. By focusing on Indic language datasets, Meta is looking to strengthen its model's ability to cater to multilingual environments, boosting accessibility and functionality for speakers of these languages. Case in point: these models empower startups like Sarvam AI and scale across Meta's platforms -- WhatsApp, Instagram, Facebook, and Threads -- for impactful, region-specific AI solutions.

Speaking at India's biggest AI summit, Cypher 2024, Vivek Raghavan, the chief of Sarvam AI, revealed that they used Llama 3.1 405B to build Sarvam 2B, a 2-billion-parameter model trained on 4 trillion tokens, of which 2 trillion are Indian language tokens. "If you look at the 100 billion tokens in Indian languages, we used a clever method to create synthetic data for building these models using Llama 3.1 405B. We trained the model on 1,024 NVIDIA H100s in India, and it took only 15 days," said Raghavan.

"So, one important thing to consider is when you think about the flagship models like 405B, they're very expensive for inference," shared Paluri, adding that these models can generate high-quality synthetic data, particularly for underserved languages like those in the Indic family, which are often challenging to source in traditional datasets. "Synthetic data generated by Llama 3.1 405B is instrumental in building diverse language resources, making it feasible to support languages like Hindi, Tamil, and Telugu, which are often underserved in standard datasets," said Paluri.

Llama 4, when?

Meta CEO Mark Zuckerberg, in a recent interview with AI influencer Rowan Cheung, said the company has already started pre-training for Llama 4. Zuckerberg added that Meta has set up compute clusters and data infrastructure for Llama 4, which he expects to be a major advancement over Llama 3. Meta's VP of product, Ragavan Srinivasan, at Meta's Build with AI Summit, hinted at releasing "next gen" Llama models by 2025, with native integrations, extended memory and context capabilities, cross-modality support, and expanded third-party collaborations, while advancing memory-based applications for coding and leveraging deep hardware partnerships.

Paluri joked that if you asked Zuckerberg about the timeline, he'd probably say it would be released "today", highlighting his enthusiasm and push for rapid progress in AI development. Citing Llama 3's release in April, 3.1 in July, and 3.2 in September, he outlined the rapid iteration of Llama releases, noting that the team strives to ship new versions every few months. "We want to maintain a continuous momentum of improvements in each generation, so developers can expect predictable, significant upgrades with every release," said Paluri, hinting at a 'next gen' Llama arriving potentially around early to mid-2025 if Meta continues this cadence.

Quantisation of LLMs: Meta recently introduced quantised versions of its Llama 3.2 models, enhancing on-device AI performance with up to four times faster inference speeds, a 56% reduction in model size, and a 41% decrease in memory usage.
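Those size and memory numbers come from Meta's own quantisation recipes (QLoRA and SpinQuant), but the underlying mechanism, storing weights as 8-bit integers instead of 32-bit floats, can be sketched with PyTorch's generic post-training dynamic quantisation. This is an illustration of the technique under that assumption, not Meta's exact method.

```python
import os
import torch
import torch.nn as nn

# A toy MLP standing in for an LLM's linear layers, where most weights live.
model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))

# Post-training dynamic quantisation: weights stored as int8, activations
# quantised on the fly at inference time.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear},
                                                dtype=torch.qint8)

def size_mb(m: nn.Module) -> float:
    """Serialize the model and report its on-disk footprint in MB."""
    torch.save(m.state_dict(), "_tmp.pt")
    size = os.path.getsize("_tmp.pt") / 1e6
    os.remove("_tmp.pt")
    return size

# int8 weights take roughly a quarter of the fp32 footprint.
print(f"fp32: {size_mb(model):.1f} MB  ->  int8: {size_mb(qmodel):.1f} MB")
```

The roughly 4x weight compression from fp32 to int8 is the same lever behind the reported 56% size reduction; Meta's figure is smaller than 4x because its models start from 16-bit weights and keep some layers in higher precision.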
[7]
Meta Likely to Release Llama 4 Early Next Year, Pushing Towards Advanced Machine Intelligence (AMI)
[8]
How I Met Your Llama
Everybody loves Llama. Not just India's "GPU-poor" developers; even India's richest billionaire, Mukesh Ambani, took the opportunity to laud Llama and its impact in India. In a fireside chat with NVIDIA CEO Jensen Huang, Ambani said that Llama, the open-source AI model by tech giant Meta, can be used as a base to build state-of-the-art technologies in India. "This move of Mark will be written in history when we look at it a hundred years from now," he said, envisioning a 'Jio Moment' in AI for India. His vision is becoming a reality in India's tech ecosystem. "Like we did with data, a few years from now, we are going to surprise the world with what Indians can achieve in the intelligence market," Ambani said.

At the Build With Meta Summit in Bengaluru, Manohar Paluri, VP at Meta, mentioned that Llama has had 400 million cumulative downloads since Llama 1's release in early 2023, with India among the top three markets globally. The ease of implementing, fine-tuning, and deploying an open-source model has found relevance in India for realising ideas that can change the lives of over a billion people. At the event, Meta announced partnerships with household consumer apps such as Flipkart, Meesho, Redbus, Dream11 and Infoedge. Meta hasn't revealed how Llama was integrated into these apps, but generative AI has helped ship features that provide a personalised user experience, and one can expect more enhancements along these lines.

Flipkart has backed Tune AI (previously NimbleBox.ai), a startup that helps developers build, deploy and manage AI models without requiring deep expertise or high-end infrastructure; Tune AI is built on multiple large language models, including Meta's Llama. The e-commerce giant also mentioned that it uses Llama Guard 2 8B, an LLM-based safeguarding tool that can assess the safety of user input to an LLM. Flipkart has integrated Llama Guard inside Aegis, its in-house safety layer, which prevents harmful and unwanted inputs from reaching an LLM, and currently uses the tool in several chat-based experiences for sellers and buyers on the platform.
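As a sketch of what such a safety layer looks like in practice, the snippet below runs a prompt through a Llama Guard checkpoint with Hugging Face transformers, following the pattern in Meta's model card. The model ID is an assumption (the checkpoint is gated behind Meta's license), and Flipkart's actual Aegis integration is not public.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; gated, so you must accept Meta's license on Hugging Face.
model_id = "meta-llama/Meta-Llama-Guard-2-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto")

# Llama Guard classifies a chat transcript rather than answering it.
chat = [{"role": "user", "content": "How do I reset my seller account password?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)

output = model.generate(input_ids=input_ids, max_new_tokens=24, pad_token_id=0)
# Prints "safe", or "unsafe" followed by the violated category code (e.g. "S2").
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

A safety layer like Aegis would sit in front of the main chat model, only forwarding inputs (and outputs) that the classifier marks "safe".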
Moreover, apps in India are already using Llama in their tech stack without any explicit partnership with Meta. Paluri mentioned that Ola is one of them: "We didn't know about it, so we're going to follow up and understand that. It's not just small companies or developers, but actually large companies that are actually building on top of this ecosystem," he shared. While we are yet to discover how Meta's partnerships with these consumer products evolve, they may indicate a significant shift in how generative AI can provide value to millions of Indians, and India's existing generative AI ecosystem, developed with Llama, is indicative of this.

If there's one thing that India thrives on, it's its rich culture with a diverse set of languages, and if there's one thing genAI is a champion at, that's language. Paluri also mentioned that Llama 3.1 is pre-trained on all prominent languages of India, which explains the rise in synthetic data generation tools for Indic languages over the past few months. We've also seen Indic LLMs that aid specific sectors like agriculture, transportation and education. For example, KissanAI's Dhenu Llama 3 is designed to help farmers with their queries and understands both Hindi and English.

In September of this year, Wadhwani AI, an independent nonprofit institute building AI solutions, was awarded a grant of $500,000 to realise its goal of improving the fluency and comprehension of English among students using Llama 3's capabilities. Wadhwani has been building AI solutions for crucial problems in agriculture and healthcare since 2018; according to the institute, its projects have already reached 10 million people across 15 states of India.

Even Indian Railways, which receives around 300,000 customer queries and complaints daily, has found a use case for Llama. It has partnered with Ubona, a speech recognition solutions provider, to build a Llama-based custom LLM that supports multiple Indian languages in IVR calls.

Llama is also helping Indians curb fake news and perform fact checks. The Sach AI chatbot, built by Factly, uses Llama 3.1 to help users verify the legitimacy of news and facts they come across; Factly was selected as a recipient of Meta's Llama Impact Innovation Award. According to the World Economic Forum's 2024 Global Risks Report, India ranks highest for the risk of misinformation and disinformation, so Factly is tackling a problem that may have severe consequences for the country's public harmony, mental health, and political stability.

Llama's open-source advantage benefits more than just the regular consumer; the enterprise sector hasn't been left behind in this AI revolution. Some of the biggest names in India's IT industry have turned towards Meta's open-source language models. PwC India has partnered with Meta to democratise generative AI, combining consulting advisory and technological expertise to make its services more valuable for companies. Infosys is also investing heavily in generative AI: earlier, Yann LeCun revealed that one of Infosys' founders is funding work to fine-tune Llama 2 to comprehend 22 Indian languages, and a few days ago Infosys announced a partnership with Meta, unveiling a 'Meta Centre of Excellence' that focuses on large-scale adoption of the Llama models and helps enterprises, open source groups, and its internal teams build solutions on the open-source model. Tata Technologies, meanwhile, built an automotive design studio with Llama 2 and Stable Diffusion. Santosh Singh, executive VP at Tata Technologies, said, "The team uses generative AI to develop multiple design options on the fly. It helps reduce design time, engineering time, and product development time."

And it isn't all about open source; Llama 3 is up there with some of the best foundation models in the market. With Llama 4 bound to be released next year, Meta's Llama series is becoming the de facto standard in the country and around the globe -- touted by Zuckerberg as the Linux moment in AI. Meta also offers Llama in multiple variants by parameter size, helping developers select an appropriate model for their use case. Moreover, the recent additions of the multimodal technologies SpiritLM and MovieGen to the Llama family have turned heads. That's the beauty of Meta's open-source models: choice, high performance and multimodal capabilities. "Who would've thought that Mark Zuckerberg would be the good guy (in AI)?" George Hotz, founder of Comma.ai, remarked in a podcast episode with Lex Fridman.
[9]
Meta's Llama 3.1 405B is the Missing Link for Indic Datasets
Earlier this year, when Meta released the Llama 3.1 405B model, the updated license allowed developers to use outputs from Llama models -- including the 405B -- to improve other models. This benefitted developers and AI startups in India that were building Indic LLMs. Yann LeCun, Meta's chief AI scientist, acknowledged the issues developers had previously encountered with Llama around model creation. Speaking at Meta's Build with AI Summit in Bengaluru, LeCun said the company took note and addressed the concerns in the newer version.

Sharing the stage with LeCun in a fireside chat, Nandan Nilekani, influential Indian entrepreneur and co-founder of Infosys, said that this development will help Indian AI startups use LLMs easily and that it isn't necessary for India to build LLMs from scratch. "Our goal should not be to create another LLM. Let the big players in Silicon Valley handle that," Nilekani said. "India should become the use-case capital of the world and focus on building small models quickly." He added that India will use Llama to create synthetic data, build small language models quickly, and train them on appropriate data.

Nilekani is partially right, as using OpenAI's GPT-4o or Anthropic's Claude 3.5 Sonnet APIs can be expensive. In contrast, Llama 3.1 405B is freely available on Hugging Face and competes well with top foundation models like GPT-4, GPT-4o, and Claude 3.5 Sonnet. Nilekani opined that the correct approach for Indian AI companies is to create appropriate data; notably, in 2022, he invested in AI4Bharat, a research lab dedicated to creating open-source datasets, tools, models, and applications for Indian languages.

Speaking at Cypher 2024, Vivek Raghavan, the chief of Sarvam AI, revealed that they used Llama 3.1 405B to build Sarvam 2B, a 2-billion-parameter model trained on 4 trillion tokens, of which 2 trillion are Indian language tokens. Sarvam 2B is part of a class of small language models (SLMs) that includes Microsoft's Phi series, Llama 3 (8 billion), and Google's Gemma models. It serves as a viable alternative to large models, such as those from OpenAI and Anthropic, while being more efficient for specific use cases. "If you look at the 100 billion tokens in Indian languages, we used a clever method to create synthetic data for building these models using Llama 3.1 405B. We trained the model on 1,024 NVIDIA H100s in India, and it took only 15 days," said Raghavan.

Regarding Sarvam 2B, he further said that the model performs well on Indic tasks: "It is extremely good for summarisation in Indian languages, and for any kind of NLP task in Indian languages -- this will outperform models that are much bigger." The company recently launched its latest model, Sarvam-1, which outperforms Google's Gemma-2 and Llama 3.2 on Indic tasks. The company claims that its secret sauce is 2 trillion tokens of synthetic Indic data, equivalent to 6-8 trillion regular tokens thanks to its highly efficient tokenizer.
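Tokenizer efficiency is what makes that "2 trillion equals 6-8 trillion" equivalence plausible: an English-centric tokenizer shatters Indic text into many byte-level fragments, so an Indic-optimised tokenizer can encode the same text in a third or a quarter of the tokens. A quick illustration using the openly available GPT-2 tokenizer, chosen only because it is ungated; Sarvam's own tokenizer is the relevant one, and this merely demonstrates the effect.

```python
from transformers import AutoTokenizer

# An English-centric BPE tokenizer, used here only to illustrate "fertility"
# (tokens per character) on Indic text; it is not Sarvam's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

hindi = "भारत विविध भाषाओं का देश है।"   # "India is a land of diverse languages."
english = "India is a land of diverse languages."

for label, text in [("Hindi", hindi), ("English", english)]:
    n_tokens = len(tokenizer(text)["input_ids"])
    print(f"{label}: {len(text)} chars -> {n_tokens} tokens "
          f"(fertility {n_tokens / len(text):.2f})")
```

Run this and the Hindi sentence produces several times more tokens per character than the English one; a tokenizer trained on Indic text closes that gap, which is exactly the 3-4x equivalence Sarvam claims.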
Sarvam AI is not alone. In a recent interaction with AIM, Meta VP Manohar Paluri revealed that even Ola Krutrim uses Llama. "Not just small companies or developers, even large companies build on top of this ecosystem, which really gives us confidence that we have the momentum." He added that people now use Llama as their de facto intelligence layer and build their businesses on top of it. "We are actually trying to bring high-quality Indian tokens into Llama so that Llama will work in Indian languages," he said.

Paluri explained that since Llama is the engine for Meta AI, the assistant will support Indian languages, benefiting the over a billion people in India who use Meta AI on WhatsApp, Facebook, and Instagram. Meta AI boasts over 500 million monthly active users globally and is on track to become the most widely used AI chatbot by the end of 2024. It was recently launched in Hindi, and the company plans to integrate multiple Indian languages into future models.

In the Llama 3.1 research paper, Meta states that the model was trained on multilingual data and generates high-quality instruction-tuning data for languages such as German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with high-quality, manually annotated data collected from linguists and native speakers. Moreover, with a context length of 128K tokens, it can process and generate longer, more complex pieces of text, which is useful for creating diverse synthetic datasets. "Synthetic data generation is one of the main use cases for these very large models. This can be extremely helpful in domains where obtaining a large, high-quality dataset is challenging due to cost, privacy concerns, or simply a lack of available data," said Hamid Shojanazeri, ML engineer at Meta.

Meta is not limited to LLMs. It intends to empower developers with the resources to create custom agents and discover new types of agentic behaviours.
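As a sketch of the synthetic-data workflow Shojanazeri describes above: prompt an instruction-tuned Llama checkpoint to produce training examples in a target language. The 405B model discussed in the article is impractical to run locally, so this assumes the gated 8B instruct variant and a recent transformers version that accepts chat-style inputs in the text-generation pipeline.

```python
from transformers import pipeline

# Assumed checkpoint; gated behind Meta's license on Hugging Face. The workflow
# in the article uses Llama 3.1 405B, which needs a multi-GPU cluster to serve.
generator = pipeline("text-generation",
                     model="meta-llama/Llama-3.1-8B-Instruct",
                     device_map="auto")

# Ask the model to synthesise labelled examples for an underserved language.
messages = [{"role": "user",
             "content": "Generate three Hindi question-answer pairs about "
                        "crop insurance, formatted as 'Q: ... A: ...'."}]

result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```

In practice such outputs are filtered and deduplicated before being used to train a smaller model, which is the pattern Sarvam describes for Sarvam 2B.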
[10]
Meta's Llama 3.1 is the Missing Link for Indic Datasets
Meta CEO Mark Zuckerberg has announced that Llama 4 is being trained on a record-breaking cluster of over 100,000 NVIDIA H100 GPUs, with release planned for early 2025.
Meta, under the leadership of Mark Zuckerberg, is making significant strides in the AI race with its upcoming Llama 4 model. Zuckerberg recently announced that the company is training Llama 4 on a massive AI GPU cluster, surpassing any previously reported setup in the industry [1][2].
The heart of Llama 4's development lies in its unprecedented computing power:
- A training cluster of more than 100,000 NVIDIA H100 GPUs, larger than anything publicly reported to date [1][2]
- A reported hardware cost of over $2 billion for the H100 chips alone [1]
- A wider fleet that Meta expects to exceed half a million H100-equivalent GPUs by the end of 2024 [2]
This massive infrastructure underscores Meta's commitment to pushing the boundaries of AI capabilities and maintaining a competitive edge in the rapidly evolving field.
While specific details remain under wraps, Zuckerberg has hinted at several exciting features for Llama 4:
- New modalities beyond text
- Stronger reasoning capabilities
- Much faster performance [2][3]
The initial launch of Llama 4 is slated for early 2025, with smaller models expected to be ready first [4][5].
Meta's strategy for Llama 4 stands out in the AI landscape:
- Llama models can be downloaded in full for free, unlike API-only rivals such as OpenAI's GPT-4o and Google's Gemini [2][3]
- The license still restricts some commercial uses, and Meta does not disclose training details [3]
The development of Llama 4 is part of Meta's broader AI strategy:
- Llama 3.2 has seen wide adoption among enterprises and the U.S. public sector [4]
- Over one million advertisers use Meta's generative AI tools, and AI-driven recommendations have lifted time spent on Facebook and Instagram [4]
- Meta's researchers are pushing towards AMI systems that combine perception, reasoning, and planning [6][7]
The massive GPU cluster raises significant environmental concerns:
- A 100,000-GPU cluster could draw roughly 370 GWh a year, enough to power over 34,000 average American households [2]
- Meta executives have not said how the cluster will be powered, while rivals turn to mobile generators and nuclear energy [2]
Meta's Llama 4 project is part of an ongoing competition among tech giants:
- Elon Musk's xAI runs a 100,000-GPU Colossus cluster and plans to double it to 200,000 [1][2]
- Microsoft, Google, and other major players are believed to be working towards clusters of similar scale [2][3]
As the AI race intensifies, the development of Llama 4 represents a significant milestone in Meta's quest for AI dominance, with potential far-reaching implications for the tech industry and society at large.
References
[1] Meta using over 100,000 NVIDIA H100 AI GPUs for Llama 4, Zuck: 'bigger than anything I've seen'
[4] Mark Zuckerberg Confirms Llama 4 Release Early Next Year
[5] Mark Zuckerberg Confirms Llama 4 Release in Early 2025