Curated by THEOUTPOST
On Wed, 19 Mar, 12:11 AM UTC
5 Sources
[1]
Nvidia debuts Llama Nemotron open reasoning models in a bid to advance agentic AI
Nvidia is getting into the open source reasoning model market. At the Nvidia GTC event today, the AI giant made a series of hardware and software announcements. Buried amid the big silicon announcements, the company announced a new set of open source Llama Nemotron reasoning models to help accelerate agentic AI workloads. The new models are an extension of the Nvidia Nemotron models that were first announced in January at the Consumer Electronics Show (CES). The Llama Nemotron reasoning models are in part a response to the dramatic rise of reasoning models in 2025. Nvidia (and its stock price) were rocked to the core earlier this year when DeepSeek R1 came out, offering the promise of an open source reasoning model with superior performance. The Llama Nemotron family is competitive with DeepSeek, offering business-ready AI reasoning models for advanced agents. "Agents are autonomous software systems designed to reason, plan, act and critique their work," Kari Briski, vice president of Generative AI Software Product Management at Nvidia, said during a GTC pre-briefing with press. "Just like humans, agents need to understand context to break down complex requests, understand the user's intent, and adapt in real time."

What's inside Llama Nemotron for agentic AI

As the name implies, Llama Nemotron is based on Meta's open source Llama models. With Llama as the foundation, Briski said that Nvidia algorithmically pruned the model to optimize compute requirements while maintaining accuracy. Nvidia also applied sophisticated post-training techniques using synthetic data. The training involved 360,000 H100 inference hours and 45,000 human annotation hours to enhance reasoning capabilities. All that training results in models with exceptional reasoning capabilities across key benchmarks for math, tool calling, instruction following and conversational tasks, according to Nvidia.
The Llama Nemotron family has three different models

The family includes three models targeting different deployment scenarios: Nano for PCs and edge devices, Super for high accuracy and throughput on a single GPU, and Ultra for maximum accuracy on multi-GPU servers. For availability, Nano and Super are now available as NIM microservices and can be downloaded from AI.NVIDIA.com. Ultra is coming soon.

Hybrid reasoning helps to advance agentic AI workloads

One of the key features in Nvidia Llama Nemotron is the ability to toggle reasoning on or off. The ability to toggle reasoning is an emerging capability in the AI market. Anthropic's Claude 3.7 has somewhat similar functionality, though that model is closed and proprietary. In the open source space, IBM Granite 3.2 also has a reasoning toggle, which IBM refers to as "conditional reasoning." The promise of hybrid or conditional reasoning is that it allows systems to bypass computationally expensive reasoning steps for simple queries. In a demonstration, Nvidia showed how the model could engage complex reasoning when solving a combinatorial problem but switch to direct response mode for simple factual queries.

Nvidia Agent AI-Q blueprint provides an enterprise integration layer

Recognizing that models alone aren't sufficient for enterprise deployment, Nvidia also announced the Agent AI-Q blueprint, an open-source framework for connecting AI agents to enterprise systems and data sources. "AI-Q is a new blueprint that enables agents to query multiple data types -- text, images, video -- and leverage external tools like web search and other agents," Briski said. "For teams of connected agents, the blueprint provides observability and transparency into agent activity, allowing developers to improve the system over time." The AI-Q blueprint is set to become available in April.

Why this matters for enterprise AI adoption

For enterprises considering advanced AI agent deployments, Nvidia's announcements address several key challenges. The open nature of Llama Nemotron models allows businesses to deploy reasoning-capable AI within their own infrastructure.
That's important as it can address the data sovereignty and privacy concerns that have limited adoption of cloud-only solutions. By building the new models as NIMs, Nvidia is also making it easier for organizations to deploy and manage them, whether on-premises or in the cloud. The hybrid, conditional reasoning approach is also notable, as it gives organizations another option to choose from for this type of emerging capability. Hybrid reasoning allows enterprises to optimize for either thoroughness or speed, saving on latency and compute for simpler tasks while still enabling complex reasoning when needed. As enterprise AI moves beyond simple applications to more complex reasoning tasks, Nvidia's combined offering of efficient reasoning models and integration frameworks positions companies to deploy more sophisticated AI agents that can handle multi-step logical problems while maintaining deployment flexibility and cost efficiency.
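In practice, the toggle is a prompt-level switch rather than a separate API: Nvidia's published Nemotron model cards control reasoning through the system prompt ("detailed thinking on" / "detailed thinking off"). A minimal sketch of routing queries past the expensive reasoning path, assuming that system-prompt convention and a hypothetical `build_messages` helper:

```python
def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    """Build an OpenAI-style chat message list with the reasoning toggle.

    Assumes the system-prompt convention from Nvidia's published Nemotron
    checkpoints, where "detailed thinking on"/"detailed thinking off"
    switches the reasoning mode.
    """
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

# Spend the multi-step reasoning budget only where the query warrants it.
hard = build_messages(
    "In how many ways can 8 rooks be placed on a chessboard so none attack?",
    reasoning=True,
)
easy = build_messages("What is the capital of France?", reasoning=False)
```

An application-side router could flip the flag per request, with a cheap classifier or heuristic deciding whether a query merits the reasoning pass, so the same deployed model serves both modes.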
[2]
NVIDIA Launches Family of Open Reasoning AI Models for Developers and Enterprises to Build Agentic AI Platforms
GTC -- NVIDIA today announced the open Llama Nemotron family of models with reasoning capabilities, designed to provide developers and enterprises a business-ready foundation for creating advanced AI agents that can work independently or as connected teams to solve complex tasks. Built on Llama models, the NVIDIA Llama Nemotron reasoning family delivers on-demand AI reasoning capabilities. NVIDIA enhanced the new reasoning model family during post-training to improve multistep math, coding, reasoning and complex decision-making. This refinement process boosts accuracy of the models by up to 20% compared with the base model and optimizes inference speed by 5x compared with other leading open reasoning models. The improvements in inference performance mean the models can handle more complex reasoning tasks, enhance decision-making capabilities and reduce operational costs for enterprises. Leading agent AI platform pioneers -- including Accenture, Amdocs, Atlassian, Box, Cadence, CrowdStrike, Deloitte, IQVIA, Microsoft, SAP and ServiceNow -- are collaborating with NVIDIA on its new reasoning models and software. "Reasoning and agentic AI adoption is incredible," said Jensen Huang, founder and CEO of NVIDIA. "NVIDIA's open reasoning models, software and tools give developers and enterprises everywhere the building blocks to create an accelerated agentic AI workforce."

NVIDIA Post-Training Boosts Accuracy and Reliability for Enterprise Reasoning

Built to deliver production-ready AI reasoning, the Llama Nemotron model family is available as NVIDIA NIM™ microservices in Nano, Super and Ultra sizes -- each optimized for different deployment needs. The Nano model delivers the highest accuracy on PCs and edge devices, the Super model offers the best accuracy and highest throughput on a single GPU, and the Ultra model will provide maximum agentic accuracy on multi-GPU servers.
NVIDIA conducted extensive post-training on NVIDIA DGX™ Cloud using high-quality curated synthetic data generated by NVIDIA Nemotron™ and other open models, as well as additional curated datasets cocreated by NVIDIA. The tools, datasets and post-training optimization techniques used to develop the models will be openly available, giving enterprises the flexibility to build their own custom reasoning models.

Agentic Platforms Team With NVIDIA to Enhance Reasoning for Industries

Agentic AI platform industry leaders are working with the Llama Nemotron reasoning models to deliver advanced reasoning to enterprises. Microsoft is integrating Llama Nemotron reasoning models and NIM microservices into Microsoft Azure AI Foundry. This expands the Azure AI Foundry model catalog with options for customers to enhance services like Azure AI Agent Service for Microsoft 365. SAP is tapping Llama Nemotron models to advance SAP Business AI solutions and Joule, the AI copilot from SAP. Additionally, it is using NVIDIA NIM and NVIDIA NeMo™ microservices to promote increased code completion accuracy for SAP ABAP programming language models. "We are collaborating with NVIDIA to integrate Llama Nemotron reasoning models into Joule to enhance our AI agents, making them more intuitive, accurate and cost effective," said Walter Sun, global head of AI at SAP. "These advanced reasoning models will refine and rewrite user queries, enabling our AI to better understand inquiries and deliver smarter, more efficient AI-powered experiences that drive business innovation." ServiceNow is harnessing Llama Nemotron models to build AI agents that offer greater performance and accuracy to enhance enterprise productivity across industries.
Accenture has made NVIDIA Llama Nemotron reasoning models available on its AI Refinery platform -- including new industry agent solutions announced today -- to enable clients to rapidly develop and deploy custom AI agents tailored to industry-specific challenges, accelerating business transformation. Deloitte is planning to incorporate Llama Nemotron reasoning models into its recently announced Zora AI agentic AI platform, designed to support and emulate human decision-making and action with agents that include deep functional- and industry-specific business knowledge and built-in transparency.

NVIDIA AI Enterprise Delivers Essential Tools for Agentic AI

Developers can deploy NVIDIA Llama Nemotron reasoning models with new NVIDIA agentic AI tools and software to streamline the adoption of advanced reasoning in collaborative AI systems. All part of the NVIDIA AI Enterprise software platform, the latest agentic AI building blocks include the NVIDIA AI-Q Blueprint, the NVIDIA AgentIQ toolkit and new NVIDIA NeMo microservices.

Availability

The NVIDIA Llama Nemotron Nano and Super models and NIM microservices are available as a hosted application programming interface from build.nvidia.com and Hugging Face. Access for development, testing and research is free for members of the NVIDIA Developer Program. Enterprises can run Llama Nemotron NIM microservices in production with NVIDIA AI Enterprise on accelerated data center and cloud infrastructure. Developers can sign up to be notified when NVIDIA NeMo microservices are publicly available. The NVIDIA AI-Q Blueprint is expected to be available in April. The NVIDIA AgentIQ toolkit is available now on GitHub.
[3]
Nvidia's new reasoning models and building blocks pave way for next-gen AI agents - SiliconANGLE
Nvidia Corp. is looking to capitalize on the agentic artificial intelligence trend not only by providing the underlying infrastructure, but also the models that power these next-generation autonomous agents. At its GTC 2025 annual conference today, the company unveiled a new family of Llama Nemotron AI models with advanced reasoning capabilities. Based on Meta Platforms Inc.'s renowned open-source Llama models, they're designed to provide developers with a strong foundation on which they can build advanced AI agents that perform tasks on behalf of their users with minimal supervision. Nvidia explained that it basically just took Meta's Llama models and improved them using post-training enhancement techniques to increase their multistep math, coding, complex decision-making and reasoning skills. Employing some careful refinements, Nvidia claims that the Llama Nemotron AI models are 20% more accurate than the Llama models they're based on, while their inference speed has been increased by an impressive five times, enabling them to handle many more complex tasks with lower operational costs. The Llama Nemotron models are being made available through Nvidia's NIM microservices platform in three sizes - Nano, Super and Ultra - optimized for different kinds of applications. According to Nvidia, Llama Nemotron Nano is designed for deployment on personal computers, edge devices and other low-powered hardware. The Super model is optimized to run on a single graphics processing unit, while the Ultra version is meant for those who need maximum performance running on multiple GPU servers. The company said it carried out its post-training refinements using the Nvidia DGX Cloud platform with high-quality synthetic data from Nvidia Nemotron, plus its own proprietary, curated datasets.
To keep things open, it's making these datasets, the tools it used and details of its post-training optimization techniques publicly available, so everyone can see the improvements and develop their own foundational reasoning models. Although just announced today, Nvidia has already amassed an impressive list of partners that are using the Llama Nemotron models to create powerful new AI agents. For instance, Microsoft Corp. is making them available on its cloud-based Azure AI Foundry service, and they'll also be listed as an option for customers to create new agents using the Azure AI Agent Service for Microsoft 365. Another partner, SAP SE, is utilizing the Llama Nemotron models to improve the capabilities of its AI assistant Joule and its SAP Business AI solutions portfolio. And others, including Accenture Plc, Atlassian Corp., Box Inc. and ServiceNow Inc., are also working with Nvidia to ensure their customers can access the Llama Nemotron models. Of course, for anyone looking to create AI agents, the underlying large language models are just one part of the equation. There's also the infrastructure to consider, the tools needed to piece them together, the all-important data pipelines to provide them with knowledge, and much more. Nvidia is catering to most of these needs, announcing a host of additional agentic AI building blocks at GTC 2025 today. They include the new Nvidia AI-Q Blueprint, which is a framework that enables developers to connect knowledge bases to AI agents that can act autonomously. The Blueprint was built with Nvidia NIM microservices and integrates with Nvidia NeMo Retriever, making it simple for AI agents to retrieve multimodal data in various formats. Meanwhile, the new Nvidia AI Data Platform is a customizable reference design that's being made available to the world's most important storage providers. 
The idea is to help storage infrastructure providers such as Dell Technologies Inc., Hewlett Packard Enterprise Co., Hitachi Vantara, IBM Corp., NetApp Inc., Nutanix Inc., Vast Data Inc. and Pure Storage Inc. develop more efficient data platforms for agentic AI inference workloads. By combining highly optimized storage resources with Nvidia's accelerated computing hardware, the company promises that developers will see some major performance gains when it comes to AI reasoning, as it will ensure the smooth flow of information from database to model. There are also updated Nvidia NIM microservices, which are used to optimize agentic AI inference to support continuous learning and adaptiveness. Using these microservices, customers will be able to reliably deploy the latest and most powerful agentic AI models, including Nvidia's Llama Nemotron and alternatives from the likes of Meta, Microsoft and Mistral AI. Finally, Nvidia said it's enhancing its NeMo microservices, which provide a framework for developers to build robust and efficient data flywheels. This is key to ensuring that AI agents can learn continuously based on both human- and AI-generated feedback. Sticking with AI agents, Nvidia also revealed it's expanding its association with Oracle Corp. to bring agentic AI to Oracle Cloud Infrastructure. Under the partnership, Nvidia is bringing its accelerated GPUs and inference software to Oracle's cloud infrastructure and making it compatible with that company's generative AI services. It will help to accelerate AI agent development on OCI. All told, Nvidia now offers more than 160 AI tools and NIM microservices available natively via the OCI console. Further, the companies announced they're also working to accelerate vector search on the Oracle Database 23ai platform.
Moving away from its focus on AI agents, Nvidia also provided an update on its expanded collaborations with Google LLC, revealing a series of initiatives that aim to enhance and improve access to AI and its underlying tooling. Nvidia said it will become the first organization to leverage Google DeepMind's SynthID, which directly embeds digital watermarks into AI-generated images, video and text. That helps preserve the integrity of AI outputs. SynthID is initially being integrated with Nvidia's Cosmos World foundation models, where it will provide safeguards against misinformation and wrongful attribution. Elsewhere, Nvidia helped Google's DeepMind researchers to optimize a family of open-source, lightweight AI models called Gemma to run on its GPUs, and they're also working on an initiative to build AI-powered robots with grasping skills, and various other projects. "It's a great joy to see Google and Nvidia researchers and engineers collaborate to solve incredible challenges, from drug discovery to robotics," said Nvidia Chief Executive Jensen Huang.
[4]
Llama Nemotron: Nvidia's answer to the AI reasoning boom
Nvidia has introduced a new set of open source Llama Nemotron reasoning models during its GTC event, aimed at enhancing agentic AI workloads. These models build upon the Nvidia Nemotron models first announced at the Consumer Electronics Show (CES) in January. The launch of the Llama Nemotron reasoning models is partially a response to the surge in reasoning models witnessed in 2025. Nvidia's market position faced challenges earlier this year with the introduction of DeepSeek R1, which promised an open source reasoning model with superior performance. The Llama Nemotron family is designed to provide competitive business-ready AI reasoning models for advanced agents. "Agents are autonomous software systems designed to reason, plan, act and critique their work," stated Kari Briski, vice president of Generative AI Software Product Management at Nvidia. She emphasized that agents, like humans, require contextual understanding to break down complex requests, grasp user intent, and adapt in real time. The Llama Nemotron models are based on Meta's open source Llama models. Nvidia optimized these models by algorithmically pruning them to meet compute requirements while retaining accuracy. The company also utilized advanced post-training techniques with synthetic data, amounting to 360,000 H100 inference hours and 45,000 human annotation hours to improve reasoning capabilities. The training efforts have led to models that excel in benchmarks related to math, tool calling, instruction following, and conversational tasks. The Llama Nemotron family consists of three distinct models, each targeting different deployment scenarios: Nano for edge devices, Super for single-GPU deployments, and Ultra for multi-GPU servers. The Nemotron Nano and Super models are currently available through NIM microservices and can be downloaded from AI.NVIDIA.com. The Ultra model is expected to be released soon. A significant feature of the Llama Nemotron models is the ability to toggle reasoning on or off.
This emerging capability allows systems to avoid costly reasoning processes for straightforward queries. For instance, during a demonstration, Nvidia illustrated how the model could perform complex reasoning for a combinatorial problem while shifting to direct response mode for basic factual questions. Nvidia also unveiled the Agent AI-Q blueprint, an open-source framework designed to integrate AI agents with enterprise systems and data sources. "AI-Q is a new blueprint that enables agents to query multiple data types -- text, images, video -- and leverage external tools like web search and other agents," explained Briski. The framework aims to enhance observability and transparency for teams using connected agents, allowing developers to refine the system over time. The AI-Q blueprint is scheduled to be available in April. Nvidia's Llama Nemotron models provide enterprises a chance to deploy reasoning-capable AI within their infrastructures, addressing data sovereignty and privacy issues commonly associated with cloud-only solutions. This initiative facilitates smoother deployment and management whether on-premises or in the cloud. The hybrid, conditional reasoning option allows organizations to prioritize thoroughness or speed, optimizing latency and compute for simpler tasks while supporting complex reasoning as necessary. As enterprises evolve towards more intricate AI applications, Nvidia's combination of efficient reasoning models and integration frameworks positions them for deploying sophisticated AI agents capable of multi-step logical problem-solving.
[5]
Nvidia's Llama Nemotron Models Will Let Businesses Build AI Agents
Llama Nemotron is available in Nano, Super and Ultra size variants

Nvidia released a new family of artificial intelligence (AI) models on Tuesday at its GPU Technology Conference (GTC) 2025. Dubbed Llama Nemotron, these are the company's latest reasoning-focused large language models (LLMs), designed to offer a foundation for agentic AI workflows. The Santa Clara-based tech giant said these models are aimed at developers and enterprises, enabling them to build advanced AI agents that can either work independently or as connected teams to perform complex tasks. The Llama Nemotron models are currently available via Nvidia's platform and Hugging Face.

Nvidia Introduces New Reasoning-Focused AI Models

In a newsroom post, the tech giant detailed the new AI models. The Llama Nemotron reasoning models are based on Meta's Llama 3 series models, with post-training enhancements added by Nvidia. The company highlighted that the family of AI models displays improved capabilities in multistep mathematics, coding, reasoning, and complex decision-making. It said this process improved the accuracy of the models by up to 20 percent compared to the base models. The inference speed is also said to have been improved by five times compared to similar-sized open-source reasoning models. Nvidia claimed that "the models can handle more complex reasoning tasks, enhance decision-making capabilities, and reduce operational costs for enterprises." With these advancements, the LLMs can be used to build and power AI agents. Llama Nemotron reasoning models are available in three parameter sizes -- Nano, Super, and Ultra. The Nano model is best suited for on-device and edge-based tasks that require high accuracy. The Super variant is placed in the middle to offer high accuracy and throughput on a single GPU. Finally, the Ultra model is meant to be run on multi-GPU servers and offers maximum agentic accuracy.
The post-training of the reasoning models was done on the Nvidia DGX Cloud using curated synthetic data generated using the Nemotron platform as well as other open models. The tech giant is also making the tools, datasets, and post-training optimisation techniques used to develop the Llama Nemotron models available to the open-source community. Nvidia is also working with enterprise partners to bring the models to developers and businesses. The reasoning models and the NIM microservices can be accessed via Microsoft's Azure AI Foundry and as an option via the Azure AI Agent Service. SAP is also using the models for its Business AI solutions and its AI copilot, dubbed Joule, the company said. Other enterprises using Llama Nemotron models include ServiceNow, Accenture, and Deloitte. The Llama Nemotron Nano and Super models and NIM microservices are available for businesses and developers as an application programming interface (API) via Nvidia's platform as well as its Hugging Face listing. They are available under the permissive Nvidia Open Model License Agreement, which allows both research and commercial usage.
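For developers evaluating the hosted API route, the request shape is the familiar OpenAI-style chat completion. A minimal sketch, assuming the OpenAI-compatible endpoint Nvidia exposes for its hosted NIMs; the model identifier here is a placeholder, so verify the exact id on build.nvidia.com before use:

```python
# Endpoint follows the OpenAI-compatible convention used by Nvidia's
# hosted NIM APIs; the model id below is an assumed placeholder.
ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "nvidia/llama-nemotron-super"  # hypothetical id -- verify first

def make_request(api_key: str, prompt: str) -> tuple[dict, dict]:
    """Build the headers and JSON payload for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "max_tokens": 1024,
    }
    return headers, payload

headers, payload = make_request("YOUR_NVAPI_KEY", "Draft a rollout plan for an AI agent pilot.")
# To send: requests.post(ENDPOINT, headers=headers, json=payload, timeout=60)
```

Because the interface is OpenAI-compatible, the same payload works against a self-hosted NIM container by swapping the base URL, which is what makes the on-premises and cloud deployment options interchangeable at the application layer.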
Nvidia introduces Llama Nemotron, a family of open-source AI models with enhanced reasoning capabilities, designed to provide a foundation for building advanced AI agents. The models offer improved accuracy and inference speed, targeting various deployment scenarios from edge devices to multi-GPU servers.
Nvidia has unveiled its new family of open-source AI models, Llama Nemotron, at the GTC 2025 event. These models are designed to provide advanced reasoning capabilities for agentic AI workloads, building upon the Nvidia Nemotron models announced earlier this year [1][2].

The Llama Nemotron models, based on Meta's open-source Llama models, have been optimized through algorithmic pruning and sophisticated post-training techniques. Nvidia reports that these enhancements have resulted in:

- Accuracy improvements of up to 20% over the base Llama models
- Inference speeds up to 5x faster than other leading open reasoning models

The Llama Nemotron family includes three models targeting different deployment scenarios:

- Nano: highest accuracy on PCs and edge devices
- Super: best accuracy and highest throughput on a single GPU
- Ultra: maximum agentic accuracy on multi-GPU servers

Nano and Super models are currently available as NVIDIA NIM microservices and can be downloaded from AI.NVIDIA.com, while the Ultra model is coming soon [1].

Nvidia has announced collaborations with leading companies to integrate Llama Nemotron models into various enterprise solutions:

- Microsoft is integrating the models and NIM microservices into Azure AI Foundry
- SAP is using them to advance SAP Business AI and its Joule copilot
- ServiceNow, Accenture and Deloitte are building enterprise AI agents with them

To support the adoption of advanced reasoning in collaborative AI systems, Nvidia has introduced new agentic AI building blocks as part of the NVIDIA AI Enterprise software platform:

- The NVIDIA AI-Q Blueprint, for connecting knowledge bases to autonomous agents, expected in April
- The NVIDIA AgentIQ toolkit, available now on GitHub
- Updated NVIDIA NIM and NeMo microservices for agentic inference and data flywheels

The open nature of Llama Nemotron models addresses data sovereignty and privacy concerns, allowing businesses to deploy reasoning-capable AI within their own infrastructure. The hybrid, conditional reasoning approach enables organizations to optimize for either thoroughness or speed, potentially reducing operational costs [1][4].

As enterprise AI evolves beyond simple applications to more complex reasoning tasks, Nvidia's combined offering of efficient reasoning models and integration frameworks positions companies to deploy sophisticated AI agents capable of handling multi-step logical problems while maintaining deployment flexibility and cost efficiency [1][5].

The Llama Nemotron Nano and Super models and NIM microservices are available as a hosted API from build.nvidia.com and Hugging Face. Access for development, testing, and research is free for members of the NVIDIA Developer Program. The models are released under the Nvidia Open Model License Agreement, allowing for both research and commercial usage [2][5].
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved