10 Sources
[1]
Nvidia Becomes a Major Model Maker With Nemotron 3
Nvidia has made a fortune supplying chips to companies working on artificial intelligence, but today the chipmaker took a step toward becoming a more serious model maker itself by releasing a series of cutting-edge open models, along with data and tools to help engineers use them. The move, which comes at a moment when AI companies like OpenAI, Google, and Anthropic are developing increasingly capable chips of their own, could be a hedge against these firms veering away from Nvidia's technology over time. Open models are already a crucial part of the AI ecosystem with many researchers and startups using them to experiment, prototype, and build. While OpenAI and Google offer small open models, they do not update them as frequently as their rivals in China. For this reason and others, open models from Chinese companies are currently much more popular, according to data from Hugging Face, a hosting platform for open source projects. Nvidia's new Nemotron 3 models are among the best that can be downloaded, modified, and run on one's own hardware, according to benchmark scores shared by the company ahead of release. "Open innovation is the foundation of AI progress," CEO Jensen Huang said in a statement ahead of the news. "With Nemotron, we're transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale." Nvidia is taking a more fully transparent approach than many of its US rivals by releasing the data used to train Nemotron -- a fact that should help engineers modify the models more easily. The company is also releasing tools to help with customization and fine-tuning. This includes a new hybrid latent mixture-of-experts model architecture, which Nvidia says is especially good for building AI agents that can take actions on computers or the web. 
The company is also launching libraries that allow users to train agents to do things using reinforcement learning, which involves giving models simulated rewards and punishments. Nemotron 3 models come in three sizes: Nano, which has 30 billion parameters; Super, which has 100 billion; and Ultra, which has 500 billion. A model's parameters loosely correspond to how capable it is as well as how unwieldy it is to run. The largest models are so cumbersome that they need to run on racks of expensive hardware. Kari Ann Briski, vice president of generative AI software for enterprise at Nvidia, said open models are important to AI builders for three reasons: Builders increasingly need to customize models for particular tasks; it often helps to hand queries off to different models; and it is easier to squeeze more intelligent responses from these models after training by having them perform a kind of simulated reasoning. "We believe open source is the foundation for AI innovation, continuing to accelerate the global economy," Briski said. The social media giant Meta released the first advanced open models under the name Llama in February 2023. As competition has intensified, however, Meta has signaled that its future releases might not be open source. The move is part of a larger trend in the AI industry. Over the past year, US firms have moved away from openness, becoming more secretive about their research and more reluctant to tip off their rivals about their latest engineering tricks.
[2]
As Meta fades in open-source AI, Nvidia senses its chance to lead
Nvidia argues it's more open than Meta with data transparency. Seizing upon a shift in the field of open-source artificial intelligence, chip giant Nvidia, whose processors dominate AI, has unveiled the third generation of its Nemotron family of open-source large language models. The new Nemotron 3 family scales the technology from what had been one-billion-parameter and 340-billion-parameter models (the parameter count is the number of neural weights) to three new models: 30 billion parameters for Nano, 100 billion for Super, and 500 billion for Ultra. The Nano model, available now on the HuggingFace code hosting platform, increases throughput in tokens per second by four times and extends the context window -- the amount of data that can be held in the model's memory -- to one million tokens, seven times as large as its predecessor's. Nvidia emphasized that the models aim to address several concerns for enterprise users of generative AI, who worry about accuracy as well as the rising cost of processing an increasing number of tokens each time AI makes a prediction. "With Nemotron 3, we are aiming to solve those problems of openness, efficiency, and intelligence," said Kari Briski, vice president of generative AI software at Nvidia, in an interview with ZDNET before the release. The Super version of the model is expected to arrive in January, and Ultra is due in March or April. Nvidia, Briski emphasized, has increasing prominence in open-source. "This year alone, we had the most contributions and repositories on HuggingFace," she told me. It's clear to me from our conversation that Nvidia sees a chance not only to boost enterprise usage, thereby fueling chip sales, but also to seize leadership in open-source development of AI. After all, this field looks like it might lose one of its biggest stars of recent years, Meta Platforms.
When Meta, owner of Facebook, Instagram, and WhatsApp, first debuted its open-source Llama gen AI technology in February 2023, it was a landmark event: a fast, capable model with some code available to researchers, versus the "closed-source," proprietary models of OpenAI, Google, and others. Llama quickly came to dominate developer attention in open-source tech as Meta unveiled fresh innovations in 2024 and scaled up the technology to compete with the best proprietary frontier models from OpenAI and the rest. But 2025 has been different. The company's rollout of the fourth generation of Llama, in April, was greeted with mediocre reviews and even a controversy about how Meta developed the program. These days, Llama models don't show up in the top 100 models on LMSYS's popular LMArena Leaderboard, which is dominated by proprietary models such as Google's Gemini, xAI's Grok, Anthropic's Claude, and OpenAI's GPT-5.2, and by open-source models such as those from DeepSeek AI, Alibaba's Qwen models, and the Kimi K2 model developed by Beijing-based Moonshot AI. Charts from the third-party firm Artificial Analysis show a similar ranking. Meanwhile, the recent "State of Generative AI" report from venture capitalists Menlo Ventures blamed Llama for helping to reduce the use of open-source in the enterprise. "The model's stagnation -- including no new major releases since the April release of Llama 4 -- has contributed to a decline in overall enterprise open-source share from 19% last year to 11% today," they wrote. Leaderboard scores can come and go, but after a broad reshuffling of its AI team this year, Meta appears poised to place less emphasis on open source. 
A forthcoming Meta project code-named Avocado, wrote Bloomberg reporters Kurt Wagner and Riley Griffin last week, "may be launched as a 'closed' model -- one that can be tightly controlled and that Meta can sell access to," according to their unnamed sources. The move to closed models "would mark the biggest departure to date from the open-source strategy Meta has touted for years," they wrote. Meta's Chief AI Officer, Alexandr Wang, installed this year after Meta invested in his previous company, Scale AI, "is an advocate of closed models," Wagner and Griffin noted. (An article over the weekend by Eli Tan of The New York Times suggested that there have been tensions between Wang and various product leads for Instagram and advertising inside of Meta.) When I asked Briski about Menlo Ventures's claim that open source is struggling, she replied, "I agree about the decline of Llama, but I don't agree with the decline of open source." Added Briski, "Qwen models from Alibaba are super popular, DeepSeek is really popular -- I know many, many companies that are fine-tuning and deploying DeepSeek." While Llama may have faded, it's also true that Nvidia's own Nemotron family has not yet reached the top of the leaderboards. In fact, the family of models lags DeepSeek, Kimi, Qwen, and other increasingly popular offerings. But Nvidia believes it is addressing many of the pain points that plague enterprise deployment, specifically. One focus of companies is to "cost-optimize" with a mix of closed-source and open-source models, said Briski. "One model does not make an AI application, and so there is this combination of frontier models and then being able to cost-optimize with open models, and how do I route to the right model." 
The focus on a selection of models, from Nano to Ultra, is expected to address the need for broad coverage of task requirements. The second challenge is to "specialize" AI models for a mix of tasks in the enterprise, ranging from cybersecurity to electronic design automation and healthcare, Briski said. "When we go across all these verticals, frontier models are really great, and you can send some data to them, but you don't want to send all your data to them," she observed. Open-source tech, then, running "on-premise," is crucial, she said, "to actually help the experts in the field to specialize them for that last mile." The third challenge is the exploding cost of tokens, the output of text, images, sound, and other data forms, generated piece by piece when a live model makes predictions. "The demand for tokens from all these models being used is just going up," said Briski, especially with "long-thinking" or "reasoning" models that generate verbose output. "This time last year, each query would take maybe 10 LLM calls," noted Briski. "In January, we were seeing each query making about 50 LLM calls, and now, as people are asking more complex questions, there are 100 LLM calls for every query." To balance demands such as accuracy, efficiency, and cost, the Nemotron 3 models improve upon a popular approach used to control model costs called "mixture of experts" (MoE), where the model can turn groups of the neural network weights on and off to run with less computing effort. The fresh approach, called "latent mixture of experts," used in the Super and Ultra models, compresses the memory used to store data in the model weights, while multiple "expert" neural networks use the data. 
"We're getting four times better memory usage by reducing the KV-cache," compared to the prior Nemotron, said Briski, referring to the key-value cache, the part of a large language model that stores attention data for the tokens it has already processed. The more-efficient latent MoE should give greater accuracy at a lower cost while preserving latency, how fast the first token comes back to the user, and bandwidth, the number of tokens transmitted per second. In data provided by Artificial Analysis, said Briski, Nemotron 3 Nano surpasses a top model, OpenAI's GPT-OSS, in terms of accuracy of output and the number of tokens generated each second. Another big concern for enterprises is the data that goes into models, and Briski said the company aims to be much more transparent with its open-source approach. "A lot of our enterprise customers can't deploy with some models, or they can't build their business on a model that they don't know what the source code is," she said, including training data. The Nemotron 3 release on HuggingFace includes not only the model weights but also trillions of tokens of training data used by Nvidia for pre-training, post-training, and reinforcement learning. There is a separate data set for "agentic safety," which the company says will provide "real-world telemetry to help teams evaluate and strengthen the safety of complex agent systems." "If you consider the data sets, the source code, everything that we use to train is open," said Briski. "Literally, every piece of data that we train the model with, we are releasing." Meta's team has not been as open, she said. "Llama did not release their data sets at all; they released the weights," Briski told me. When Nvidia partnered with Meta last year, she said, to convert the Llama 3.1 models to smaller Nemotron models, via a popular approach known as "distillation," Meta withheld resources from Nvidia. 
"Even with us as a great partner, they wouldn't even release a sliver of the data set to help distill the model," she said. "That was a recipe we kind of had to come up with on our own." Nvidia's emphasis on data transparency may help to reverse a worrying trend toward diminished transparency. Scholars at MIT recently conducted a broad study of code repositories on HuggingFace. They reported that truly open-source postings are on the decline, citing "a clear decline in both the availability and disclosure of models' training data." As lead author Shayne Longpre and team pointed out, "The Open Source Initiative defines open source AI models as those which have open model weights, but also 'sufficiently detailed information about their [training] data'," adding, "Without training data disclosure, a released model is considered 'open weight' rather than 'open source'." It's clear Nvidia and Meta have different priorities. Meta needs to make a profit from AI to reassure Wall Street about its planned spending of hundreds of billions of dollars to build AI data centers. Nvidia, the world's largest company, needs to ensure it keeps developers hooked on its chip platform, which generates the majority of its revenue. Meta CEO Mark Zuckerberg has suggested Llama is still important, telling Wall Street analysts in October, "As we improve the quality of the model, primarily for post-training Llama 4, at this point, we continue to see improvements in usage." However, he also emphasized moving beyond just having a popular LLM with the new directions his newly formed Meta Superintelligence Labs (MSL) will take. "So, our view is that when we get the new models that we're building in MSL in there, and get, like, truly frontier models with novel capabilities that you don't have in other places, then I think that this is just a massive latent opportunity." 
As for Nvidia, "Large language models and generative AI are the way that you will design software of the future," Briski told me. "It's the new development platform." Support is key, she said, and, in what could be taken as a dig at Zuckerberg's intransigence, though not intended as such, Briski invoked the words of Nvidia founder and CEO Jensen Huang: "As Jensen says, we'll support it as long as we shall live."
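Briski's "four times better memory usage by reducing the KV-cache" claim can be made concrete with a back-of-the-envelope calculation. This is an illustrative sketch only: the layer count, head configuration, and the assumption that the 4x saving comes from a 4x-smaller latent key/value dimension are hypothetical, not Nemotron's actual architecture.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_len, bytes_per_val=2):
    # Keys and values are each stored per layer, per KV head, per token,
    # hence the factor of 2; bytes_per_val=2 assumes fp16/bf16 storage.
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_val

# Hypothetical transformer: 48 layers, 8 KV heads of dimension 128,
# and a 1-million-token context window like Nemotron 3 Nano's.
full = kv_cache_bytes(48, 8, 128, 1_000_000)

# Projecting keys/values into a latent space a quarter the size
# (mirroring the quoted "four times better memory usage") shrinks the cache.
compressed = kv_cache_bytes(48, 8, 128 // 4, 1_000_000)

print(f"{full / 2**30:.0f} GiB vs {compressed / 2**30:.0f} GiB")  # → 183 GiB vs 46 GiB
```

At million-token contexts the cache, not the weights, dominates memory, which is why compressing it preserves latency and bandwidth at lower cost.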
[3]
Nvidia Wants To Be Your Open Source Model Provider
It seems that chip dominance isn't enough for Nvidia. The tech giant wants its products to be present at every level of the AI development ecosystem, and it's got its eyes on open-source AI models. Nvidia released its latest Nemotron 3 family of open models on Monday, only months after announcing the first iteration at the company's GTC event in March. The company press release claims they possess "leading accuracy for building agentic AI applications." Open-source AI is often preferred by some users, like firms or governments, for its ability to be customized to the hyper-specific needs of an industry, and for being more private, giving organizations the ability to run the data on their own internal servers instead of the AI company's. American tech giants had largely drawn back from the world of open source models as the companies increasingly opted for proprietary options that they can arguably make more money from. Largely the only exception to this has been Meta, which is also reportedly eyeing a shift to proprietary. Even OpenAI, which purports to be an "open" organization, unveiled its first open source models in five years only this past August, with the release of gpt-oss-120b and gpt-oss-20b. In Silicon Valley's relative absence, Chinese models have dominated the open source space with high profile releases from DeepSeek and Alibaba. It's all part of China's strategy to achieve global AI dominance: an "AI for all" approach would provide broader reach and greater influence on the global industry, especially at a time when Beijing believes that it can produce AI chips powerful enough to become a sizable competitor to Nvidia's offerings. From the looks of it, Nvidia CEO Jensen Huang is worried about this. He has taken almost every opportunity the past few months to urge Washington to take this possibility seriously. At the company's GTC conference held in D.C. back in October, Huang also warned attendees of China's open source dominance. 
He argued that because China is a huge creator of open source software, if Americans retreat completely from China they might risk being "ill-prepared" for when Chinese software "permeates the world." Despite not being a big consumer-facing creator of AI models -- aside from its AI face generator StyleGAN back in the pre-ChatGPT days -- Nvidia seems to be taking matters into its own hands with this China situation. Company executives made it clear that it's not proprietary models they are trying to compete with. But when it comes to open source, they want to be a bigger player as fellow American tech giants might be conceding and shifting focus. Nvidia sees Nemotron as a tool to drive further reliance worldwide on its hardware empire. By giving out this open model to developers worldwide, it is hoping to ensure that any new models built will be best aligned with its silicon rather than the rapidly emerging industry in China. "When we're the best development platform, again, people are going to choose us, choose our platform, choose our GPU, in order to be able to not just build today, but also build for tomorrow," Nvidia's vice president of generative AI software for enterprise Kari Briski said at a press briefing ahead of the release. Briski believes that a reliable and consistent roadmap of new model releases can help Nvidia achieve this open model edge. "We recognize that people developing AI applications can't rely on a model when there's only been just one model release and no road map," Briski said, adding that Nvidia will be investing in building that reliability and consistency and is already eyeing new models. "GPT-oss is a fantastic model, OpenAI is a great partner we love them. The last time they put out an open model was over five years ago."
[4]
Nvidia debuts Nemotron 3 with hybrid MoE and Mamba-Transformer to drive efficient agentic AI
Nvidia launched the new version of its frontier models, Nemotron 3, leaning on a model architecture that the world's most valuable company said offers more accuracy and reliability for agents. Nemotron 3 will be available in three sizes: Nemotron 3 Nano, with 30B parameters, mainly for targeted, highly efficient tasks; Nemotron 3 Super, a 100B-parameter model for multi-agent applications with high-accuracy reasoning; and Nemotron 3 Ultra, with its large reasoning engine and around 500B parameters for more complex applications. To build the Nemotron 3 models, Nvidia said it leaned into a hybrid mixture-of-experts (MoE) architecture to improve scalability and efficiency. By using this architecture, Nvidia said in a press release, its new models also offer enterprises more openness and performance when building multi-agent autonomous systems. Kari Briski, Nvidia vice president for generative AI software, told reporters in a briefing that the company wanted to demonstrate its commitment to learning and improving from previous iterations of its models. "We believe that we are uniquely positioned to serve a wide range of developers who want full flexibility to customize models for building specialized AI by combining that new hybrid mixture of our mixture of experts architecture with a 1 million token context length," Briski said. Nvidia said early adopters of the Nemotron 3 models include Accenture, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens and Zoom.
Breakthrough architectures
Nvidia has been using the hybrid Mamba-Transformer mixture-of-experts architecture for many of its models, including Nemotron-Nano-9B-v2. The architecture is based on research from Carnegie Mellon University and Princeton, which weaves in selective state-space models to handle long pieces of information while maintaining state. It can reduce compute costs even through long contexts. 
Nvidia noted its design "achieves up to 4x higher token throughput" compared to Nemotron 2 Nano and can significantly lower inference costs by reducing reasoning token generation by up to 60%. "We really need to be able to bring that efficiency up and the cost per token down. And you can do it through a number of ways, but we're really doing it through the innovations of that model architecture," Briski said. "The hybrid Mamba transformer architecture runs several times faster with less memory, because it avoids these huge attention maps and key value caches for every single token." Nvidia also introduced an additional innovation for the Nemotron 3 Super and Ultra models. For these, Briski said Nvidia deployed "a breakthrough called latent MoE." "That's all these experts that are in your model share a common core and keep only a small part private. It's kind of like chefs sharing one big kitchen, but they need to get their own spice rack," Briski added. Nvidia is not the only company that employs this kind of architecture to build models. AI21 Labs uses it for its Jamba models, most recently in its Jamba Reasoning 3B model. The Nemotron 3 models benefited from extended reinforcement learning. The larger models, Super and Ultra, used the company's 4-bit NVFP4 training format, which allows them to train on existing infrastructure without compromising accuracy. Benchmark testing from Artificial Analysis placed the Nemotron models highly among models of similar size.
New environments for models to 'work out'
As part of the Nemotron 3 launch, Nvidia will also give users access to its research by releasing its papers and sample prompts, offering open datasets where people can use and look at pre-training tokens and post-training samples, and, most importantly, a new NeMo Gym where customers can let their models and agents "work out." 
The NeMo Gym is a reinforcement learning lab where users can let their models run in simulated environments to test their post-training performance. AWS announced a similar tool through its Nova Forge platform, targeted at enterprises that want to test out their newly created distilled or smaller models. Briski said the samples of post-training data Nvidia plans to release "are orders of magnitude larger than any available post-training data set and are also very permissive and open." Nvidia said it is releasing more information about how it trains its models because developers seeking highly intelligent and performant open models want to better understand how to guide them when needed. "Model developers today hit this tough trifecta. They need to find models that are ultra open, that are extremely intelligent and are highly efficient," she said. "Most open models force developers into painful trade-offs between efficiencies like token costs, latency, and throughput." She said developers want to know how a model was trained, where the training data came from and how they can evaluate it.
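Briski's "shared kitchen" description of latent MoE can be sketched roughly as follows: every expert reuses one large shared projection (the kitchen) and keeps only a small private matrix of its own (the spice rack), while a gate activates just the top-k experts per token. All shapes, the gating rule, and the weight initialization here are assumptions for illustration, not Nvidia's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_shared, d_private, n_experts, top_k = 64, 64, 8, 8, 2

# Shared "kitchen": one large core projection reused by every expert.
W_shared = rng.normal(size=(d_model, d_shared)) * 0.02
# Private "spice racks": each expert holds only a small private matrix.
W_private = rng.normal(size=(n_experts, d_shared, d_private)) * 0.02
W_out = rng.normal(size=(n_experts, d_private, d_model)) * 0.02
W_gate = rng.normal(size=(d_model, n_experts)) * 0.02

def latent_moe(x):
    """Route one token vector x through only the top-k experts."""
    scores = x @ W_gate
    topk = np.argsort(scores)[-top_k:]      # experts to activate
    weights = np.exp(scores[topk])
    weights /= weights.sum()                # softmax over chosen experts
    h = x @ W_shared                        # shared core, computed once
    out = np.zeros(d_model)
    for w, e in zip(weights, topk):
        out += w * ((h @ W_private[e]) @ W_out[e])  # small private path
    return out

y = latent_moe(rng.normal(size=d_model))
print(y.shape)  # (64,)
```

Because only top_k of the n_experts private paths run per token, active parameters stay a small fraction of the total, which matches the "30 billion parameters, 3 billion active" pattern described for Nano.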
[5]
Nvidia launches Nemotron 3 model family as open foundation for agentic AI systems - SiliconANGLE
Nvidia Corp. today announced the launch of Nemotron 3, a family of open models and data libraries aimed at powering the next generation of agentic artificial intelligence operations across industries. The new family of models will consist of three sizes: Nano, Super and Ultra. The company built them on a breakthrough hybrid latent mixture-of-experts architecture that compresses memory needs, optimizes compute and delivers exceptional intelligence for its size. According to the company, the smallest model, Nemotron 3 Nano, delivers 4x higher throughput than its predecessor while maintaining high performance. "Open innovation is the foundation of AI progress," said Jensen Huang, founder and chief executive of Nvidia Corp. "With Nemotron, we're transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale." The company said the release of these AI models addresses the aggressive acceleration of industry adoption of agentic AI applications that require reasoning and tool orchestration. The era of single-model chatbots is giving way to orchestrated AI applications that coordinate multiple models to power proactive and intelligent agents. The Nemotron 3 launch also positions Nvidia more directly against a growing field of open and semi-open reasoning models, as competition shifts from raw parameter counts toward orchestration, reliability and agent-centric performance. Using its new architecture, Nvidia said, it will help drive costs down while still providing high-speed, reliable reasoning and intelligence. Launching today, Nano is a small 30-billion-parameter model with 3 billion active parameters, aimed at targeted, highly efficient tasks. 
The next larger model, Super, combines 100 billion parameters with 10 billion active parameters and is designed to provide mid-range intelligence for multi-agent applications. Finally, Ultra, a large reasoning engine weighing in at 500 billion parameters with 50 billion active, delivers powerful reasoning for complex AI applications and agentic orchestration. Nemotron Super excels at applications that bring together multiple AI agents collaborating on complex tasks with low latency. Nemotron Ultra, by contrast, is positioned as a powerhouse "brain" at the center of demanding AI workflows that require deep research and long-horizon strategic planning. Nano is available now, Super will arrive during the first quarter of 2026 and Ultra is expected in the first half of next year. Using the company's ultra-efficient 4-bit NVFP4 training format, developers and engineers can deploy the models on Nvidia's Blackwell architecture across smaller numbers of graphics processing units, with a significantly reduced memory footprint. The efficient training process also allows the models to scale down through distillation without notable losses to accuracy or reasoning capability. Early adopters of the Nemotron family include Accenture plc, CrowdStrike Holdings Inc., Oracle Cloud Infrastructure, Palantir Technologies Inc., Perplexity AI Inc., ServiceNow Inc., Siemens AG and Zoom Communications Inc. "Perplexity is built on the idea that human curiosity will be amplified by accurate AI built into exceptional tools, like AI assistants," said Aravind Srinivas, CEO of Perplexity. "With our agent router, we can direct workloads to the best fine-tuned open models, like Nemotron 3 Ultra." With the release of these models, Nvidia is betting heavily on ecosystem-driven adoption to create a mutually beneficial symbiosis. 
Although the new models do not require Nvidia hardware to run, they are highly optimized for Nvidia-designed platforms and graphics cards due to internal architectural optimizations. The company also emphasized its intention to maintain a reliable roadmap for model releases. While the current cadence of AI model launches can feel relentless, with new releases appearing almost monthly, Nvidia said it aims to provide developers with clearer expectations around model maturity and long-term support within each open-source family. In addition to the new models, Nvidia is releasing a collection of training datasets and state-of-the-art reinforcement learning libraries for building specialized AI agents. Reinforcement learning is a method of training AI models by exposing them to real-world questions, instructions and tasks. Unlike supervised learning, which relies on predefined question-and-answer datasets, reinforcement learning places models in uncertain environments, reinforcing successful actions through rewards and penalizing mistakes. The result is a reasoning system capable of operating in complex, dynamic conditions while learning rules and boundaries through active feedback. The new dataset contains three trillion tokens of new Nemotron pretraining, post-training and reinforcement learning data to supply rich reasoning, coding and multistep workflow examples. These provide the framework for building highly capable, domain-specialized agents. In addition, Nvidia released the Nemotron Agentic Safety Dataset, a real-world telemetry dataset that allows teams to evaluate and strengthen safety in complex and multi-agent systems. Building on this, Nvidia released the NeMo Gym and NeMo RL open-source libraries. Gym and RL provide training environments and a post-training foundation for Nemotron models. 
Developers can use them together as scaffolding to accelerate development by running models through test environments using RL training loops and interoperate with existing training environments. The release of the open-source Nemotron models and datasets further cements Nvidia's outreach to the broader AI ecosystem. The company said the data addresses a growing industry challenge: developers struggle to trust the behavior of models deployed into production. By providing transparent datasets and tooling, Nvidia aims to help teams define operational boundaries, train models for specific tasks and more rigorously evaluate reliability before deployment.
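The reward-and-penalty loop described above can be shown in miniature. This is a toy bandit-style sketch that assumes nothing about NeMo Gym's actual API: the "model" is just a preference score per action, nudged up when an action earns a reward and down when it is penalized.

```python
import math
import random

random.seed(0)

# Toy environment: action 1 is the "successful" behavior and earns a reward,
# action 0 is penalized. (Purely illustrative; no NeMo Gym API is implied.)
def env_reward(action):
    return 1.0 if action == 1 else -1.0

prefs = [0.0, 0.0]  # the "model": one preference score per action
lr = 0.1

for _ in range(200):
    # Explore: pick action 1 with a probability given by a logistic gate.
    p1 = 1.0 / (1.0 + math.exp(prefs[0] - prefs[1]))
    action = 1 if random.random() < p1 else 0
    # Reinforce: rewards raise a preference, penalties lower it.
    prefs[action] += lr * env_reward(action)

print(prefs[1] > prefs[0])  # True: the rewarded action wins out
```

Real post-training with NeMo RL operates on model weights and simulated task environments rather than a two-entry table, but the feedback structure, act, score, reinforce, is the same.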
[6]
NVIDIA Unveils Nemotron 3 Open Models to Power Multi-Agent AI Systems | AIM
The Nemotron 3 lineup includes Nano, Super and Ultra models built on a hybrid latent mixture-of-experts (MoE) architecture. NVIDIA on Monday announced the NVIDIA Nemotron 3 family of open models, datasets and libraries aimed at supporting the development of transparent and efficient multi-agent AI systems across industries. The Nemotron 3 lineup includes Nano, Super and Ultra models built on a hybrid latent mixture-of-experts (MoE) architecture, which NVIDIA says is designed to reduce inference costs, limit context drift and improve coordination among multiple AI agents. "Open innovation is the foundation of AI progress," NVIDIA founder and CEO Jensen Huang said. "With Nemotron, we're transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale." Among the three models, Nemotron 3 Nano is available immediately. It is a 30-billion-parameter model that activates up to 3 billion parameters per task and is optimised for low-cost inference use cases such as software debugging, summarisation and AI assistants. NVIDIA said the model delivers up to four times higher token throughput than Nemotron 2 Nano and reduces reasoning token generation by up to 60%. It is available on Hugging Face and through inference providers such as Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter and Together AI. The model is also offered as an NVIDIA NIM microservice for deployment on NVIDIA-accelerated infrastructure. Nemotron 3 Nano will also be available on AWS via Amazon Bedrock and supported on multiple cloud platforms in the coming months. On the other hand, Nemotron 3 Super is a roughly 100-billion-parameter model, designed for multi-agent applications requiring low latency, while Nemotron 3 Ultra, with about 500 billion parameters, is intended for deep reasoning and long-horizon planning tasks. 
Both Super and Ultra use NVIDIA's 4-bit NVFP4 training format on Blackwell GPUs to reduce memory requirements. These models are expected to be available in the first half of 2026. The launch comes as companies move beyond single AI chatbots toward collaborative agent-based systems, where multiple models work together on complex workflows. According to NVIDIA, Nemotron 3 allows developers to route tasks between frontier proprietary models and open Nemotron models within the same workflow to balance reasoning capability and cost efficiency. NVIDIA said the Nemotron 3 family also aligns with its sovereign AI strategy, allowing governments and enterprises to deploy models tailored to local data, regulations and policy requirements. Organisations across Europe and South Korea are among those adopting the open models, the company said. Several enterprise customers and partners, including Accenture, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys and Zoom, are integrating Nemotron models into AI workflows spanning manufacturing, cybersecurity, software development and communications. Perplexity CEO Aravind Srinivas said the company is using Nemotron within its agent routing system to optimise performance. "We can direct workloads to fine-tuned open models like Nemotron 3 Ultra or use proprietary models when tasks require it," he said. Alongside the models, NVIDIA released three trillion tokens of pretraining, post-training and reinforcement learning datasets, including an Agentic Safety Dataset for evaluating multi-agent systems. The company also open-sourced NeMo Gym, NeMo RL and NeMo Evaluator to support training, customisation and evaluation of agentic AI.
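The memory saving behind a block-scaled 4-bit format can be sketched generically. This is not the NVFP4 specification — the block size and the symmetric integer grid below are simplifying assumptions used only to show the quantize/dequantize round trip that lets weights occupy 4 bits instead of 16 or 32:

```python
import numpy as np

# Toy sketch of block-scaled 4-bit quantization, the general idea behind
# low-precision formats like NVFP4. NOT the actual NVFP4 spec: the block
# size and the int grid in [-7, 7] are simplifying assumptions.

BLOCK = 16  # elements sharing one scale factor

def quantize_4bit(x):
    """Quantize each 16-element block to 4-bit integers plus one scale."""
    blocks = x.reshape(-1, BLOCK)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0  # grid [-7, 7]
    scales[scales == 0] = 1.0                                 # avoid /0
    q = np.clip(np.round(blocks / scales), -7, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    """Reconstruct approximate float weights from codes and scales."""
    return (q.astype(np.float32) * scales).ravel()

rng = np.random.default_rng(0)
w = rng.normal(size=64).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print("max abs quantization error:", float(np.abs(w - w_hat).max()))
```

Per-block scales are the key trick: each small block adapts its grid to its own magnitude, which keeps rounding error bounded even when weight magnitudes vary widely across the tensor.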
[7]
Nvidia Bets on Open Models to Power AI Agents | PYMNTS.com
The lineup includes Nano, Super and Ultra models that leverage a hybrid latent mixture-of-experts (MoE) architecture to deliver higher throughput, extended context reasoning, and scalable performance for multi-agent workflows. Developers and enterprises can access these models, associated data and tooling to build and customize AI agents for tasks ranging from coding and reasoning to complex workflow automation. Open-source AI models are large pretrained neural networks whose weights and code are publicly available for download, inspection, modification and redistribution. In contrast with closed or proprietary models controlled by a single provider, open models enable developers, researchers and enterprises to adapt the model to specific needs, verify behavior, and integrate the technology into their systems without restrictive licensing. True open-source AI, according to industry definitions, ideally also includes transparency around training data and methodologies to ensure trust and reproducibility. Open-source models matter because they widen access to advanced AI, let independent developers tailor systems to specific domains, and provide transparency that supports safety, auditability and regulatory compliance. In the broader AI landscape, open models occupy a dynamic middle ground. Tech giants such as Nvidia, Meta and OpenAI have experimented with open or "open-weight" releases alongside proprietary offerings. Open models often lag one generation behind the most advanced closed models in raw capability, but they compensate with flexibility and reduced cost, making them attractive for bespoke applications and edge deployment. As open models improve rapidly, this performance gap is shrinking, driving wider adoption, as covered by Time.
Nvidia's decision to release Nemotron 3 as an open model reflects a deliberate expansion beyond its traditional role as a hardware supplier into a more direct provider of foundational AI software. While Nvidia has released models before, including earlier Nemotron variants and domain-specific models used by partners and internal teams, those efforts were largely positioned as reference systems or tied closely to enterprise services. With Nemotron 3, the company is making model weights, datasets, reinforcement learning environments and supporting libraries broadly available, signaling a deeper commitment to open-source development. The move is closely tied to how Nvidia sees enterprise demand evolving. As companies deploy AI agents across operations, many want models they can run on their own infrastructure, inspect for behavior, fine-tune with proprietary data and integrate tightly with existing systems. Open models provide that flexibility while still reinforcing demand for Nvidia's GPUs, networking and software tools. Previous Nvidia open or semi-open models have generally performed competitively in enterprise benchmarks, particularly in reasoning, instruction following and agentic workflows, even if they have not always matched the absolute frontier of closed models. Nvidia says Nemotron 3 builds on that foundation with a hybrid architecture that combines mixture-of-experts techniques with newer sequence modeling approaches to improve efficiency and reasoning performance. The company has cited work with companies such as Accenture, Cursor, EY, Palantir, Perplexity and Zoom, where open models are used alongside proprietary systems to balance performance, cost and governance requirements. Nvidia's open-source push comes as competition intensifies globally, particularly from China, where open-access models have advanced rapidly. 
Chinese firms and research labs have released models that rival Western counterparts on several benchmarks while emphasizing lower training and inference costs. Those models have gained significant traction on developer platforms, reshaping the competition for open AI. Usage data from developer platforms suggests Chinese open models are among the most downloaded and deployed globally, prompting analysts to argue that China's emphasis on accessibility has become a strategic advantage, as reported by the Financial Times. One prominent example is DeepSeek-V3, developed by the Hangzhou-based startup DeepSeek. The model has drawn attention for delivering strong reasoning and coding performance while relying on more efficient architectures that reduce dependence on the most advanced GPUs. Alibaba Cloud's Qwen family represents another major force in the open-model ecosystem. The Qwen lineup spans multiple model sizes and tasks and has seen broad adoption across enterprise and consumer applications, with frequent appearances near the top of open-model rankings on developer platforms such as Hugging Face.
[8]
Nvidia unveils new open-source AI models amid boom in Chinese offerings
SAN FRANCISCO -- Nvidia on Monday unveiled a new family of open-source artificial intelligence models that it says will be faster, cheaper and smarter than its previous offerings, as open-source offerings from Chinese AI labs proliferate. Nvidia is primarily known for providing chips that firms such as OpenAI use to train their closed-source models and charge money for them. But it also offers a slew of its own models for everything from physics simulations to self-driving vehicles as open-source software that can be used by researchers or by other companies, with firms such as Palantir Technologies weaving Nvidia's model into their products. Nvidia on Monday revealed the third generation of its "Nemotron" large-language models aimed at writing, coding and other tasks. The smallest of the models, called Nemotron 3 Nano, was being released Monday, with two other, larger versions coming in the first half of 2026. Nvidia, which has become the world's most valuable listed company, said that Nemotron 3 Nano was more efficient than its predecessor - meaning it would be cheaper to run - and would do better at long tasks with multiple steps. Nvidia is releasing the models as open-source offerings from Chinese tech firms such as DeepSeek, Moonshot AI and Alibaba Group Holdings are becoming widely used in the tech industry, with companies such as Airbnb disclosing use of Alibaba's Qwen open-source model. At the same time, CNBC and Bloomberg have reported that Meta Platforms is considering shifting toward closed-source models, leaving Nvidia as one of the most prominent U.S. providers of open-source offerings. Many U.S. states and government entities have banned use of Chinese models over security concerns. 
Kari Briski, vice president of generative AI software for enterprise at Nvidia, said the company aimed to provide a "model that people can depend on," and was also openly releasing its training data and other tools so that government and business users could test it for security and customize it. "This is why we're treating it like a library," Briski told Reuters in an interview. "This is why we're committed to it from a software engineering perspective."
[9]
Nvidia launches new open-source AI models amid rise of Chinese alternatives
On Monday Nvidia unveiled a new generation of open-source artificial intelligence models, marking a strategic offensive in response to the rise of models developed by Chinese labs such as DeepSeek, Alibaba and Moonshot AI. The American group, known for its chips essential to training proprietary models such as those from OpenAI, is thus strengthening its position in the open-model market amid growing technological rivalry. Dubbed Nemotron 3, this new series includes models designed for complex tasks such as writing and coding. The first model, Nemotron 3 Nano, is already available and stands out for greater energy efficiency and better performance on multi-step tasks. Two more powerful versions are expected in the first half of 2026. Nvidia also plans to release the training data and evaluation tools, enabling companies and institutions to audit or adapt the models to their needs. As some Western companies now adopt Chinese models, such as Airbnb with Qwen from Alibaba, Nvidia aims to offer a credible, controlled alternative in a climate where questions of technological sovereignty and data security are gaining momentum. Unlike Meta, which is reportedly considering a return to closed models, Nvidia reaffirms its commitment to open source and positions itself as a reliable US supplier for businesses and governments.
[10]
NVIDIA powers the agentic era: The Nemotron 3 debut explained
Open weight NVIDIA Nemotron models drive sovereign AI agentic systems
The era of the solitary chatbot is fading. As 2025 draws to a close, the artificial intelligence industry is pivoting aggressively toward "agentic AI" - systems where multiple specialized AI agents collaborate to solve complex problems. Leading this charge is NVIDIA, which has just unveiled its Nemotron 3 family of open models, a release that promises to reshape how enterprises build and deploy these digital workforces. The release of Nemotron 3 underscores a vital shift toward "open weight" models, a category that offers a middle ground between closed proprietary systems and fully open-source software. In an open weight model, the developer releases the trained neural network weights - allowing the public to download, run, and fine-tune the model on their own hardware - while often keeping the training data and deeper methodologies proprietary. This approach has exploded in popularity throughout 2024 and 2025. Google championed this early with its Gemma family, lightweight models derived from its Gemini research that allowed developers to build efficient apps on consumer hardware. Similarly, OpenAI entered the fray in August 2025 with gpt-oss, a family of open-weight reasoning models that gave enterprises the ability to run GPT-class reasoning on their own infrastructure, bypassing API dependencies for sensitive tasks. NVIDIA is now taking this concept further by optimizing open weights specifically for multi-agent workflows. At the heart of the announcement are three distinct models, each sized for a specific role in an AI agent's workflow. All three utilize a breakthrough "hybrid latent mixture-of-experts" (MoE) architecture, a design that allows the models to boast massive parameter counts while only "activating" a small fraction of them for any given task.
This ensures the models remain computationally efficient without sacrificing their intellectual depth. Nemotron 3 Nano is the fleet-footed scout of the family. Available immediately, this 30-billion-parameter model is optimized for edge computing and high-frequency tasks like software debugging and information retrieval. Despite its size, it activates only about 3 billion parameters per token. NVIDIA claims it delivers four times the throughput of its predecessor, Nemotron 2 Nano, and fits comfortably within a 1-million-token context window. For developers, this means the ability to run capable agents locally or at the edge with minimal latency. Nemotron 3 Super, scheduled for the first half of 2026, is the reliable middle manager. With approximately 100 billion parameters (10 billion active), it strikes a balance between speed and depth. It is engineered for high-accuracy reasoning in collaborative environments where multiple agents need to communicate and hand off tasks without losing context or hallucinating details. Nemotron 3 Ultra represents the heavy artillery. Also arriving in 2026, this 500-billion-parameter giant (50 billion active) serves as the "reasoning engine" for the most complex strategic planning and deep research tasks. It is designed to sit at the center of an agentic workflow, directing the smaller Nano and Super models and handling problems that require long-horizon thinking. NVIDIA's strategy extends beyond just model weights. By releasing the Nemotron 3 family alongside open-source tools like NeMo Gym and NeMo RL, the company is providing the "gymnasium" where these agents can be trained and refined.
This is particularly critical for "sovereign AI" initiatives, where nations and large corporations seek to build AI systems that align with local regulations, languages, and values, something impossible to achieve with a black-box model hosted on a foreign cloud. With early adopters like Accenture, Siemens, and CrowdStrike already integrating these models, Nemotron 3 is poised to become the standard reference architecture for the next generation of AI: a generation where models don't just talk to humans, but to each other.
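The reward-and-punishment loop that such RL tooling automates can be reduced to a toy example. The environment, action set, and epsilon-greedy agent below are stand-ins, not NeMo Gym or NeMo RL APIs:

```python
import random

# Minimal sketch of the reinforcement-learning loop described above:
# the agent tries actions, the environment returns rewards (or small
# punishments), and the agent's value estimates shift toward whatever
# earns reward. Everything here is a toy stand-in, not a NeMo API.

random.seed(0)

ACTIONS = ["tool_call", "answer", "ask_clarification"]
values = {a: 0.0 for a in ACTIONS}   # agent's running value estimates
counts = {a: 0 for a in ACTIONS}

def environment(action):
    """Simulated reward: this toy task secretly prefers 'tool_call'."""
    return 1.0 if action == "tool_call" else -0.1

for step in range(500):
    # epsilon-greedy: mostly exploit the best-known action, sometimes explore
    if random.random() < 0.1:
        action = random.choice(ACTIONS)
    else:
        action = max(values, key=values.get)
    reward = environment(action)               # reward or punishment
    counts[action] += 1
    # incremental average: nudge the estimate toward the observed reward
    values[action] += (reward - values[action]) / counts[action]

best = max(values, key=values.get)
print("learned best action:", best)
```

In a real agent-training setup the "environment" would be a simulated tool-use or coding task and the "agent" a language model, but the feedback structure — act, observe reward, update — is the same.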
Nvidia unveiled its Nemotron 3 family of open-source AI models, ranging from 30 billion to 500 billion parameters, with full training data transparency. The chipmaker is positioning itself to lead open-source AI development as Meta signals a potential shift away from its Llama open models toward proprietary systems.

Nvidia has released its Nemotron 3 family of open-source AI models, marking a strategic shift for the chipmaker from hardware supplier to serious model maker in the AI development ecosystem [1]. The new large language models come in three sizes: Nano with 30 billion parameters, Super with 100 billion, and Ultra with 500 billion parameters [1]. CEO Jensen Huang emphasized that "open innovation is the foundation of AI progress," positioning Nemotron as a platform that gives developers transparency and efficiency for building agentic AI systems at scale [1].

The timing proves significant as Meta, which pioneered advanced open models with Llama in February 2023, appears to be retreating from open-source AI [2]. Llama models no longer appear in the top 100 on LMSYS's LMArena Leaderboard, and Bloomberg reports suggest Meta's forthcoming "Avocado" project may launch as a closed model [2]. Venture capital firm Menlo Ventures noted that enterprise open-source share declined from 19% to 11%, partly blaming Llama's stagnation with no major releases since April [2].
Nvidia built the Nemotron 3 models using a breakthrough hybrid latent mixture-of-experts architecture combined with a Mamba-Transformer design [4]. Kari Briski, Nvidia's vice president of generative AI software for enterprise, explained that this architecture achieves up to 4x higher throughput than Nemotron 2 Nano while reducing reasoning token generation by up to 60% [4]. The Nano model extends the context window to one million tokens, seven times larger than its predecessor [2].

The mixture-of-experts architecture allows experts within the model to share a common core while keeping only small parts private, improving both scalability and efficiency [4]. Using Nvidia's ultra-efficient 4-bit NVFP4 training format, the larger Super and Ultra models can train on existing infrastructure without compromising accuracy [4]. This design compresses memory needs and optimizes compute while delivering exceptional intelligence for its size [5].
Nvidia is taking a more transparent approach than many US rivals by releasing the training data used to build Nemotron 3, along with tools for customization and fine-tuning [1]. The company is releasing three trillion tokens of pretraining, post-training, and reinforcement learning data, samples that are "orders of magnitude larger than any available post-training data set," according to Briski [4]. This level of openness addresses enterprise concerns about understanding how models are trained and where training data originates [4].

Nvidia is also launching NeMo Gym, a reinforcement learning lab where users can test their models and agents in simulated environments [4]. Reinforcement learning involves giving models simulated rewards and punishments, creating reasoning systems capable of operating in complex, dynamic conditions [5]. Early adopters include Accenture, CrowdStrike, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, and Zoom [4].
The move addresses the growing dominance of Chinese open-source models from DeepSeek and Alibaba's Qwen, which currently lead in popularity on platforms like Hugging Face as American tech giants have become more secretive [1]. Briski told reporters that while she agrees Llama has declined, she doesn't agree with claims of an open-source decline overall, pointing to the popularity of Qwen and DeepSeek models [2]. Nvidia emphasized it had the most contributions and repositories on Hugging Face this year [2].
The chipmaker's push into model development may also hedge against AI companies like OpenAI, Google, and Anthropic developing their own chips, potentially reducing reliance on Nvidia hardware over time [1]. By providing open models optimized for its silicon, Nvidia aims to ensure new models built worldwide remain aligned with its hardware empire rather than with emerging competitors in China [3]. "When we're the best development platform, people are going to choose us, choose our platform, choose our GPU," Briski said [3].

Nemotron 3 Nano is available now on Hugging Face, with Super expected in January and Ultra due in March or April [2]. Briski emphasized that reliability and a consistent roadmap of releases are critical for developers, who "can't rely on a model when there's only been just one model release and no road map" [3]. She noted that before its recent gpt-oss models in August, OpenAI's previous open model release was more than five years earlier [3].

The AI ecosystem increasingly demands customization for particular tasks, with enterprises needing to hand queries off to different models and squeeze more intelligent responses through simulated reasoning [1]. Nemotron 3 addresses what Briski called "a tough trifecta": finding models that are ultra-open, extremely intelligent, and highly efficient without forcing painful trade-offs among token costs, latency, and throughput [4]. The models aim to power the next generation of agentic AI, moving beyond single-model chatbots toward systems that coordinate multiple models into proactive, intelligent agents [5].
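The "hand queries off to different models" pattern described above can be sketched as a simple cost-aware dispatcher. The model names, prices, and difficulty heuristic below are purely illustrative assumptions, not any vendor's routing logic:

```python
from dataclasses import dataclass

# Hypothetical sketch of query routing between a small open model and a
# larger (possibly proprietary) one. Names, prices, and the difficulty
# heuristic are illustrative only.

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical prices, for comparison only

SMALL = Model("nemotron-3-nano", 0.02)
LARGE = Model("frontier-proprietary", 1.50)

def estimate_difficulty(query: str) -> float:
    """Toy heuristic: long queries or planning keywords look harder."""
    score = min(len(query) / 500, 1.0)
    if any(k in query.lower() for k in ("plan", "prove", "multi-step")):
        score += 0.5
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> Model:
    """Send easy queries to the cheap model, escalate hard ones."""
    return LARGE if estimate_difficulty(query) >= threshold else SMALL

print(route("Summarize this changelog.").name)
print(route("Plan a multi-step migration of our billing system.").name)
```

Production routers typically replace the heuristic with a learned classifier or the small model's own confidence signal, but the economics are the same: reserve the expensive model for the queries that need it.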
Summarized by Navi