8 Sources
[1]
Nvidia unveils new Cosmos world models, infra for robotics and physical uses | TechCrunch
Nvidia on Monday unveiled a set of new world AI models, libraries and other infrastructure for robotics developers, the most notable of which is Cosmos Reason, a 7-billion-parameter "reasoning" vision language model for physical AI applications and robots. Also joining the existing batch of Cosmos world models are Cosmos Transfer-2, which can accelerate synthetic data generation from 3D simulation scenes or spatial control inputs, and a distilled version of Cosmos Transfer-2 that is optimized for speed. During its announcement at the SIGGRAPH conference on Monday, Nvidia noted that these models are meant to be used to create synthetic text, image and video data sets for training robots and AI agents. Cosmos Reason, per Nvidia, allows robots and AI agents to "reason" thanks to its memory and physics understanding, which lets it "serve as a planning model to reason what steps an embodied agent might take next." The company says it can be used for data curation, robot planning and video analytics. The company also unveiled new neural reconstruction libraries, which include one for a rendering technique that lets developers simulate the real world in 3D using sensor data. This rendering capability is also being integrated into CARLA, a popular open-source simulator and developer platform. There is also an update to the Omniverse software development kit, and there are new servers for robotics workflows: the Nvidia RTX Pro Blackwell Servers offer a single architecture for robotic development workloads, while Nvidia DGX Cloud is a cloud-based management platform. These announcements come as the semiconductor giant pushes further into robotics, looking toward the next big use case for its AI GPUs beyond AI data centers.
[2]
CrowdStrike, Uber, Zoom Among Industry Pioneers Building Smarter Agents With NVIDIA Nemotron and Cosmos Reasoning Models for Enterprise and Physical AI Applications
Open reasoning models provide faster and extended thinking to generate smarter outcomes for AI agents across customer service, cybersecurity, manufacturing, logistics and robotics. AI agents are poised to deliver as much as $450 billion from revenue gains and cost savings by 2028, according to Capgemini. Developers building these agents are turning to higher-performing reasoning models to improve AI agent platforms and physical AI systems. At SIGGRAPH, NVIDIA today announced an expansion of two model families with reasoning capabilities -- NVIDIA Nemotron and NVIDIA Cosmos -- that leaders across industries are using to drive productivity via teams of AI agents and humanoid robots. CrowdStrike, Uber, Magna, NetApp and Zoom are among the enterprises tapping into these model families. New NVIDIA Nemotron Nano 2 and Llama Nemotron Super 1.5 models offer the highest accuracy in their size categories for scientific reasoning, math, coding, tool-calling, instruction-following and chat. These new models give AI agents the power to think more deeply and work more efficiently -- exploring broader options, speeding up research and delivering smarter results within set time limits. Think of the model as the brain of an AI agent -- it provides the core intelligence. But to make that brain useful for a business, it must be embedded into an agent that understands specific workflows, in addition to industry and business jargon, and operates safely. NVIDIA helps enterprises bridge that gap with leading libraries and AI blueprints for onboarding, customizing and governing AI agents at scale. Cosmos Reason is a new reasoning vision language model (VLM) for physical AI applications that excels in understanding how the real world works, using structured reasoning to understand concepts like physics, object permanence and space-time alignment.
Cosmos Reason is purpose-built to serve as the reasoning backbone to a robot vision language action (VLA) model, or critique and caption training data for robotics and autonomous vehicles, and equip runtime visual AI agents with spatial-temporal understanding and reasoning of physical operations, like in factories or cities. Nemotron: Highest Accuracy and Efficiency for Agentic Enterprise AI As enterprises develop AI agents to tackle complex, multistep tasks, models that can provide strong reasoning accuracy with efficient token generation enable intelligent, autonomous decision-making at scale. NVIDIA Nemotron is a family of advanced open reasoning models that use leading models, NVIDIA-curated open datasets and advanced AI techniques to provide an accurate and efficient starting point for AI agents. The latest Nemotron models deliver leading efficiency in three ways: a new hybrid model architecture, compact quantized models and a configurable thinking budget that provides developers with control over token generation, resulting in 60% lower reasoning costs. This combination lets the models reason more deeply and respond faster without needing more time or computing power, delivering better results at a lower cost. Nemotron Nano 2 provides as much as 6x higher token generation compared with other leading models of its size. Llama Nemotron Super 1.5 achieves leading performance and the highest reasoning accuracy in its class, empowering AI agents to reason better, make smarter decisions and handle complex tasks independently. It's now available in NVFP4, or 4-bit floating point, which delivers as much as 6x higher throughput on NVIDIA B200 GPUs compared with NVIDIA H100 GPUs. The chart above shows that the Nemotron models deliver top reasoning accuracy within the same timeframe and on the same compute budget, yielding the highest accuracy per dollar.
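To make the NVFP4 idea concrete: 4-bit floating-point formats represent weights on a very coarse grid of values. The sketch below rounds numbers onto the E2M1 grid (1 sign, 2 exponent, 1 mantissa bit), the element format commonly associated with NVFP4; the per-block scale factor a real NVFP4 tensor also carries is omitted here, so treat this as an illustration of the element format only, not NVIDIA's implementation.

```python
# Round-to-nearest quantization onto the 4-bit E2M1 grid.
# Illustrative only: real NVFP4 pairs these 4-bit elements with a
# per-block scale factor, which this sketch omits.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # positive representable values

def quantize_e2m1(x: float) -> float:
    """Snap x to the nearest representable E2M1 value (sign handled separately)."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 6.0)  # magnitudes beyond the max representable value saturate
    return sign * min(E2M1_GRID, key=lambda v: abs(v - mag))

print(quantize_e2m1(2.4))   # 2.0
print(quantize_e2m1(-0.7))  # -0.5
```

The coarseness of the grid is why the quality of a quantized model depends so heavily on scaling: with only eight positive values, weights must be rescaled per block so the grid covers their actual range.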
Along with the two new Nemotron models, NVIDIA is also announcing its first open VLM training dataset -- Llama Nemotron VLM dataset v1 -- with 3 million samples of optical character recognition, visual QA and captioning data that power the previously released Llama 3.1 Nemotron Nano VL 8B model. In addition to the accuracy of the reasoning models, agents also rely on retrieval-augmented generation to fetch the latest and most relevant information from connected data across disparate sources to make informed decisions. The recently released Llama 3.2 NeMo Retriever embedding model tops three visual document retrieval leaderboards -- ViDoRe V1, ViDoRe V2 and MTEB VisualDocumentRetrieval -- for boosting agentic system accuracy. Using these reasoning and information retrieval models, a deep research agent built with the AI-Q NVIDIA Blueprint is currently No. 1 for open and portable agents on DeepResearch Bench. NVIDIA NeMo and NVIDIA NIM microservices support the entire AI agent lifecycle -- from development and deployment to monitoring and optimization of the agentic systems. Cosmos Reason: A Breakthrough in Physical AI VLMs marked a breakthrough for computer vision and robotics, empowering machines to identify objects and patterns. However, nonreasoning VLMs lack the ability to understand and interact with the real world -- meaning they can't handle ambiguity or novel experiences, nor solve complex multistep tasks. NVIDIA Cosmos Reason is a new open, customizable, 7-billion-parameter reasoning VLM for physical AI and robotics. Cosmos Reason lets robots and vision AI agents reason like humans, using prior knowledge, physics understanding and common sense to understand and act in the physical world. Cosmos Reason enables advanced capabilities across robotics and physical AI applications such as training data critiquing and captioning, robot decision-making and video analytics AI agents.
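The retrieval step described above can be illustrated with a minimal embedding-similarity sketch. This is a generic toy, not the NeMo Retriever API: documents and queries are embedded as vectors (here, hand-written 3-dimensional stand-ins for what an embedding model would produce), and retrieval ranks documents by cosine similarity to the query.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents whose embeddings best match the query."""
    ranked = sorted(doc_vecs,
                    key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                    reverse=True)
    return ranked[:k]

# Toy 3-dimensional "embeddings"; a real system would compute these
# with an embedding model such as the one described above.
doc_vecs = {
    "shipping_policy": [0.9, 0.1, 0.0],
    "refund_policy":   [0.1, 0.9, 0.1],
    "release_notes":   [0.0, 0.2, 0.9],
}

print(retrieve([0.8, 0.2, 0.0], doc_vecs, k=1))  # ['shipping_policy']
```

In a production agent, the retrieved documents are then passed to the reasoning model as context, which is what "retrieval-augmented generation" refers to.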
It can help automate the curation and annotation of large, diverse training datasets, accelerating the development of high-accuracy AI models. It can also serve as a sophisticated reasoning engine for robot planning, parsing complex instructions into actionable steps for VLA models, even in new environments. It also powers video analytics AI agents built on the NVIDIA Blueprint for video search and summarization (VSS), enabled by the NVIDIA Metropolis platform, gleaning valuable insights from massive volumes of stored or live video data. These visually perceptive and interactive AI agents can help streamline operations in factories, warehouses, retail stores, airports, traffic intersections and more by spotting anomalies. NVIDIA's robotics research team uses Cosmos Reason for data filtration and curation, and as the "System 2" reasoning VLM behind VLA models such as the next versions of NVIDIA Isaac GR00T NX. Now Serving: NVIDIA Reasoning Models for AI Agents and Robots Everywhere Diverse enterprises and consulting leaders are adopting NVIDIA's latest reasoning models. Leaders spanning cybersecurity to telecommunications are among those working with Nemotron to build enterprise AI agents. Zoom plans to harness Nemotron reasoning models with Zoom AI Companion to make decisions and manage multistep tasks to take action for users across Zoom Meetings, Zoom Chat and Zoom documents. CrowdStrike is testing Nemotron models to enable its Charlotte AI agents to write queries on the CrowdStrike Falcon platform. Amdocs is using NVIDIA Nemotron models in its amAIz Suite to drive AI agents to handle complex, multistep automation spanning care, sales, network and customer support. EY is adopting Nemotron Nano 2, given its high throughput, to support agentic AI in large organizations for tax, risk management and finance use cases. 
NetApp is currently testing Nemotron reasoning models so that AI agents can search and analyze business data. DataRobot is working with Nemotron models for its Agent Workforce Platform for end-to-end agent lifecycle management. Tabnine is working with Nemotron models for suggesting and automating coding tasks on behalf of developers. Automation Anywhere, CrewAI and Dataiku are among the additional agentic AI software developers integrating Nemotron models into their platforms. Leading companies across transportation, safety and AI intelligence are using Cosmos Reason to advance autonomous driving, video analytics, and road and workplace safety. Uber is exploring Cosmos Reason to analyze autonomous vehicle behavior. In addition, Uber is post-training Cosmos Reason to summarize visual data and analyze scenarios like pedestrians walking across highways to perform quality analysis and inform autonomous driving behavior. Cosmos Reason can also serve as the brain of autonomous vehicles. It lets robots interpret environments and, given complex commands, break them down into tasks and execute them using common sense, even in unfamiliar environments. Centific is testing Cosmos Reason to enhance its AI-powered video intelligence platform. The VLM enables the platform to process complex video data into actionable insights, helping reduce false positives and improve decision-making efficiency. VAST is advancing real-time urban intelligence using NVIDIA Cosmos Reason with its AI operating system to process massive video streams at scale. With the VSS Blueprint, VAST can build agents that can identify incidents and trigger responses, turning video streams and metadata into actionable, proactive public safety tools.
Ambient.ai is working with Cosmos Reason's temporal, physics-aware reasoning to enable automated detection of missing personal protective equipment and monitoring of hazardous conditions, helping enhance environmental health and safety across construction, manufacturing, logistics and other industrial settings. Magna is developing with Cosmos Reason as part of its City Delivery Platform -- a fully autonomous, low-cost solution for instant delivery -- to help vehicles adapt more quickly to new cities. The model adds world understanding to the vehicles' long-term trajectory planning. These models are expected to be available as NVIDIA NIM microservices for secure, reliable deployment on any NVIDIA-accelerated infrastructure for maximum privacy and control. They are planned to be available soon through Amazon Bedrock and Amazon SageMaker AI for Nemotron models, as well as through Azure AI Foundry, Oracle Data Science Platform and Google Vertex AI. Try Cosmos Reason on build.nvidia.com or download it from Hugging Face or GitHub. Nemotron Nano 2 and Llama Nemotron Super 1.5 (NVFP4) will be available soon for download. Meanwhile, learn more about Nemotron models and download previous versions. Download the Llama Nemotron VLM Dataset v1 from Hugging Face. Watch the NVIDIA Research special address at SIGGRAPH and learn more about how graphics and simulation innovations come together to drive industrial digitalization by joining NVIDIA at the conference, running through Thursday, Aug. 14. See notice regarding software product information.
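For developers who do download the model from Hugging Face, a query typically goes through the transformers library's multimodal chat interface. The sketch below is a hypothetical illustration: the model ID, pipeline task name and exact message schema are assumptions to be checked against the model card, not confirmed details from NVIDIA.

```python
# Hypothetical sketch of querying a Hugging Face-hosted VLM such as Cosmos Reason.
# The model ID, pipeline task and message schema below are assumptions;
# consult the actual model card before use.

def build_messages(image_path: str, question: str) -> list:
    """Assemble a multimodal chat request in the common transformers VLM format."""
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_path},
            {"type": "text", "text": question},
        ],
    }]

def ask(image_path: str, question: str):
    """Run the model; needs `pip install transformers` and substantial GPU memory."""
    from transformers import pipeline
    vlm = pipeline("image-text-to-text", model="nvidia/Cosmos-Reason1-7B")  # hypothetical ID
    return vlm(text=build_messages(image_path, question))

# Example request for a video-analytics-style question about a single frame.
messages = build_messages("factory_cam.jpg", "Is the walkway clear of obstacles?")
```

A 7-billion-parameter VLM like this is small enough to serve from a single GPU, which is part of what makes edge deployment for factory and city monitoring plausible.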
[3]
NVIDIA Research Shapes Physical AI
AI and graphics research breakthroughs in neural rendering, 3D generation and world simulation power robotics, autonomous vehicles and content creation. Physical AI -- the engine behind modern robotics, self-driving cars and smart spaces -- relies on a mix of neural graphics, synthetic data generation, physics-based simulation, reinforcement learning and AI reasoning. It's a combination well-suited to the collective expertise of NVIDIA Research, a global team that for nearly 20 years has advanced the now-converging fields of AI and graphics. That's why at SIGGRAPH, the premier computer graphics conference taking place in Vancouver through Thursday, Aug. 14, NVIDIA Research leaders will deliver a special address highlighting the graphics and simulation innovations enabling physical and spatial AI. "AI is advancing our simulation capabilities, and our simulation capabilities are advancing AI systems," said Sanja Fidler, vice president of AI research at NVIDIA. "There's an authentic and powerful coupling between the two fields, and it's a combination that few have." At SIGGRAPH, NVIDIA is unveiling new software libraries for physical AI -- including NVIDIA Omniverse NuRec 3D Gaussian splatting libraries for large-scale world reconstruction, updates to the NVIDIA Metropolis platform for vision AI as well as NVIDIA Cosmos and NVIDIA Nemotron reasoning models. Cosmos Reason is a new reasoning vision language model for physical AI that enables robots and vision AI agents to reason like humans using prior knowledge, physics understanding and common sense. Many of these innovations are rooted in breakthroughs by the company's global research team, which is presenting over a dozen papers at the show on advancements in neural rendering, real-time path tracing, synthetic data generation and reinforcement learning -- capabilities that will feed the next generation of physical AI tools. 
How Physical AI Unites Graphics, AI and Robotics Physical AI development starts with the construction of high-fidelity, physically accurate 3D environments. Without these lifelike virtual environments, developers can't train advanced physical AI systems such as humanoid robots in simulation, because the skills the robots would learn in virtual training wouldn't translate well enough to the real world. Picture an agricultural robot using the exact amount of pressure to pick peaches off trees without bruising them, or a manufacturing robot assembling microscopic electronic components on a machine where every millimeter matters. "Physical AI needs a virtual environment that feels real, a parallel universe where the robots can safely learn through trial and error," said Ming-Yu Liu, vice president of research at NVIDIA. "To build this virtual world, we need real-time rendering, computer vision, physical motion simulation, 2D and 3D generative AI, as well as AI reasoning. These are the things that NVIDIA Research has spent nearly two decades to be good at." NVIDIA's legacy of breakthrough research in ray tracing and real-time computer graphics, dating back to the research organization's inception in 2006, plays a critical role in enabling the realism that physical AI simulations demand. Much of that rendering work, too, is powered by AI models -- a field known as neural rendering. "Our core rendering research fuels the creation of true-to-reality virtual worlds used to train advanced physical AI systems, while AI is in turn helping us create those 3D worlds from images," said Aaron Lefohn, vice president of graphics research and head of the Real-Time Graphics Research group at NVIDIA. "We're now at a point where we can take pictures and videos -- an accessible form of media that anyone can capture -- and rapidly reconstruct them into virtual 3D environments."
[4]
Making Safer Spaces: NVIDIA and Partners Bring Physical AI to Cities and Industrial Infrastructure
Advanced blueprints, vision language models and synthetic data generation extensions accelerated by NVIDIA RTX PRO 6000 Blackwell GPUs enhance productivity and improve safety across environments. Physical AI is becoming the foundation of smart cities, facilities and industrial processes across the globe. NVIDIA is working with companies including Accenture, Avathon, Belden, DeepHow, Milestone Systems and Telit Cinterion to enhance operations with physical AI-based perception and reasoning. The continuous loop of simulating, training and deploying physical AI offers sophisticated industrial automation capabilities, making cities and infrastructure safer, smarter and more efficient. For example, physical AI applications can automate potentially dangerous tasks for workers, such as working with heavy machinery. Physical AI can also improve transportation services and public safety, detect defective products in factories and more. The need for this is greater than ever: infrastructure that can perceive, reason and act relies on video sensors and the latest vision AI capabilities. Using the NVIDIA Metropolis platform -- which simplifies the development, deployment and scaling of video analytics AI agents and services from the edge to the cloud -- developers can build visual perception into their facilities faster to enhance productivity and improve safety across environments. Below are five leading companies advancing physical AI -- and five key NVIDIA Metropolis updates, announced today at the SIGGRAPH computer graphics conference, making such advancements possible. Five Companies Advancing Physical AI Global professional services company Accenture is collaborating with Belden, a leading provider of complete connection solutions, to enhance worker safety by creating smart virtual fences that factories can place around large robots to prevent accidents with human operators.
The smart virtual fence is a physical AI safety system that uses an OpenUSD-based digital twin and physics-grounded simulation to model complex industrial environments. Using computer vision-based mapping and 3D spatial intelligence, the system is adaptive to increased variability in the dynamic human-robot interactions that occur in a modern shopfloor environment. Accenture taps into the NVIDIA Omniverse platform and Metropolis to build and simulate these smart fences. With Omniverse, Accenture created a digital twin of a robot arm and workers moving in a space. And with Metropolis, the company trained its AI models and deployed them at the edge with video ingestion and the NVIDIA DeepStream software development kit (SDK)'s real-time inference capabilities. Avathon, an industrial automation platform provider, uses the NVIDIA Blueprint for video search and summarization (VSS), part of NVIDIA Metropolis, to provide manufacturing and energy facilities with real-time insights that improve operational efficiency and worker safety. Reliance British Petroleum Mobility Limited, a leader in India's fuel and mobility sector, used the Avathon video intelligence product during the construction of its gas stations to achieve higher standards of safety compliance, a reduction in safety noncompliance incidents and higher productivity by saving thousands of work hours. DeepHow has developed a "Smart Know-How Companion" for employees in manufacturing and other industries. The companion uses the Metropolis VSS blueprint to transform key workflows into bite-sized, multilingual videos and digital instructions, improving onboarding, safety and floor operator efficiency. Facing upskilling needs and retiring skilled workers, beverage company Anheuser-Busch InBev turned to the DeepHow platform to convert standard operating procedures into easy-to-understand visual guides. 
This has slashed onboarding time by 80%, boosted training consistency and improved long-term knowledge retention for employees. Milestone Systems, which offers one of the world's largest platforms for managing IP video sensor data in complex industrial and city deployments, is creating the world's largest real-world computer vision data library through its platform, Project Hafnia. Among its capabilities, the platform provides physical AI developers with access to customized vision language models (VLMs). Tapping NVIDIA NeMo Curator, Milestone Systems built a VLM fine-tuned for intelligent transportation systems for use within the VSS blueprint to help develop AI agents that better manage city roadways. Milestone Systems is also looking to use the new open, customizable NVIDIA Cosmos Reason VLM for physical AI. Internet-of-things company Telit Cinterion has integrated NVIDIA TAO Toolkit 6 into its AI-powered visual inspection platform, which uses vision foundation models like FoundationPose, alongside other NVIDIA models, to support multimodal AI and deliver high-performance inferencing. TAO brings low-code AI capabilities to the Telit platform, enabling manufacturers to quickly develop and deploy accurate, custom AI models for defect detection and quality control. Five NVIDIA Metropolis Updates for Physical AI Key updates to NVIDIA Metropolis are enhancing developers' capabilities to build physical AI applications more quickly and easily: Cosmos Reason VLM The latest version of Cosmos Reason -- NVIDIA's advanced open, customizable, 7-billion-parameter reasoning VLM for physical AI -- enables contextual video understanding and temporal event reasoning for Metropolis use cases. Its compact size makes it easy to deploy from edge to cloud and ideal for automating traffic monitoring, public safety, visual inspection and intelligent decision-making.
VSS Blueprint 2.4 VSS 2.4 makes it easy to quickly augment existing vision AI applications with Cosmos Reason and deliver powerful new features to smart infrastructure. An expanded set of application programming interfaces in the blueprint offers users more flexibility in choosing specific VSS components and capabilities to augment computer vision pipelines with generative AI. New Vision Foundation Models The NVIDIA TAO Toolkit includes a new suite of vision foundation models, along with advanced fine-tuning methods, self-supervised learning and knowledge distillation capabilities, to optimize deployment of physical AI solutions across edge and cloud environments. The NVIDIA DeepStream SDK includes a new Inference Builder to enable seamless deployment of TAO 6 models. Companies around the world -- including Advex AI, Instrumental AI and Spingence -- are experimenting with these new models and NVIDIA TAO to build intelligent solutions that optimize industrial operations and drive efficiency. NVIDIA Isaac Sim Extensions New extensions in the NVIDIA Isaac Sim reference application help solve common challenges in vision AI development -- such as limited labeled data and rare edge-case scenarios. These tools simulate human and robot interactions, generate rich object-detection datasets, and create incident-based scenes and image-caption pairs to train VLMs, accelerating development and improving AI performance in real-world conditions. Expanded Hardware Support All of these Metropolis components can now run on NVIDIA RTX PRO 6000 Blackwell GPUs, the NVIDIA DGX Spark desktop supercomputer and the NVIDIA Jetson Thor platform for physical AI and humanoid robotics -- so users can develop and deploy from the edge to the cloud. Cosmos Reason 1 and NVIDIA TAO 6.0 are now available for download. Sign up to be alerted when VSS 2.4, the Cosmos Reason VLM fine-tuning update and NVIDIA DeepStream 8.0 become available.
[5]
NVIDIA Opens Portals to World of Robotics With New Omniverse Libraries, Cosmos Physical AI Models and AI Computing Infrastructure
"Computer graphics and AI are converging to fundamentally transform robotics," said Rev Lebaredian, vice president of Omniverse and simulation technologies at NVIDIA. "By combining AI reasoning with scalable, physically accurate simulation, we're enabling developers to build tomorrow's robots and autonomous vehicles that will transform trillions of dollars in industries." New NVIDIA Omniverse Libraries Advance Applications for World Composition New NVIDIA Omniverse software development kits (SDKs) and libraries are now available for building and deploying industrial AI and robotics simulation applications. Omniverse NuRec rendering is now integrated in CARLA, a leading open-source simulator used by over 150,000 developers. Autonomous vehicle (AV) toolchain leader Foretellix is integrating NuRec, NVIDIA Omniverse Sensor RTX™ and Cosmos Transfer to enhance its scalable synthetic data generation with physically accurate scenarios. Voxel51's data engine for visual and multimodal AI, FiftyOne, supports NuRec to ease data preparation for reconstructions. FiftyOne is used by customers such as Ford and Porsche. Boston Dynamics, Figure AI, Hexagon, RAI Institute, Lightwheel and Skild AI are adopting Omniverse libraries, Isaac Sim and Isaac Lab to accelerate their AI robotics development, while Amazon Devices & Services is using them to power a new manufacturing solution. Cosmos Advances World Generation for Robotics Cosmos world foundation models (WFMs), downloaded over 2 million times, let developers generate diverse data for training robots at scale using text, image and video prompts. New models announced at SIGGRAPH deliver major advances in synthetic data generation speed, accuracy, language support and control: Lightwheel, Moon Surgical and Skild AI are using Cosmos Transfer to accelerate physical AI training by simulating diverse conditions at scale.
Cosmos Reason Breaks Through World Understanding Since the introduction of OpenAI's CLIP model, vision language models (VLMs) have transformed computer-vision tasks like object and pattern recognition. However, they have not yet been able to solve multistep tasks nor handle ambiguity or novel experiences. NVIDIA Cosmos Reason -- a new open, customizable, 7-billion-parameter reasoning VLM for physical AI and robotics -- lets robots and vision AI agents reason like humans, using prior knowledge, physics understanding and common sense to understand and act in the real world. Cosmos Reason can be used across a range of robotics and physical AI applications. NVIDIA's robotics and NVIDIA DRIVE™ teams are using Cosmos Reason for data curation and filtering, annotation and VLA post-training. Uber is using it to annotate and caption AV training data. Magna is developing with Cosmos Reason as part of its City Delivery platform -- a fully autonomous, low-cost solution for instant delivery -- to help vehicles adapt more quickly to new cities. Cosmos Reason adds world understanding to the vehicles' long-term trajectory planner. VAST Data, Milestone Systems and Linker Vision are adopting Cosmos Reason to automate traffic monitoring, improve safety and enhance visual inspection in cities and industrial settings. New NVIDIA AI Infrastructure Powers Robotics Workloads Anywhere To enable developers to take full advantage of these advanced technologies and software libraries, NVIDIA announced AI infrastructure designed for the most demanding workloads. Accelerating the Developer Ecosystem To help robotics and physical AI developers advance 3D and simulation technology adoption, NVIDIA also announced a set of additional developer ecosystem resources.
[6]
Nvidia debuts next-gen agentic AI and reasoning robotic models at SIGGRAPH 2025 - SiliconANGLE
Nvidia debuts next-gen agentic AI and reasoning robotic models at SIGGRAPH 2025 Nvidia Corp. announced today that it's expanding its offerings of smarter AI models, physical intelligence for robotics and powerful enterprise AI servers. Leading the news at the ACM SIGGRAPH 2025 computer graphics conference in Vancouver, Nvidia announced that the Nvidia RTX Pro 6000 Blackwell Server Edition, a graphics processing unit designed for servers, is coming to enterprise servers. The new addition will allow organizations to run large language models at high speed; these 2U form-factor rack-mountable servers use the Blackwell architecture to deliver high-performance AI inference workloads. "AI is reinventing computing for the first time in 60 years -- what started in the cloud is now transforming the architecture of on-premises data centers," said Jensen Huang, founder and chief executive of Nvidia. The new Blackwell RTX Pro Servers bring GPU acceleration to traditional CPU-based workloads -- including data analytics, simulation, video processing and graphics rendering -- enabling up to 45 times better performance. According to Nvidia, this results in 18 times higher energy efficiency and significantly lower cost compared with CPU-only systems. Nvidia is partnering with Cisco Systems Inc., Dell Technologies Inc., Hewlett Packard Enterprise Co., Lenovo Group Ltd. and Super Micro Computer, Inc. to offer the new servers in a variety of configurations. "With the world's leading server providers, we're making Nvidia Blackwell RTX Pro Servers the standard platform for enterprise and industrial AI," added Huang. Artificial intelligence agents are forming the foundation of a growing market, as more organizations adopt their autonomous capabilities. These agents can reason through complex tasks and plan across longer time horizons.
Nvidia announced an expansion of its Nemotron model family, introducing two new models with advanced reasoning capabilities for building smarter AI agents: Nemotron Nano 2 and Llama Nemotron Super 1.5. The company said these models deliver high accuracy for their size categories in areas such as scientific reasoning, coding, tool use, instruction following and chat. Designed to empower agents with deeper cognitive abilities, the models help AI systems explore options, weigh decisions and deliver results within defined constraints. Nemotron Nano 2 achieves up to six times higher token generation throughput compared to other models in its class. Llama Nemotron Super 1.5 offers top-tier performance and leads in reasoning accuracy, making it suitable for handling complex enterprise tasks. Nvidia is working with enterprise partners and consulting leaders to deploy these reasoning models. These include Zoom Video Communications Inc., which plans to integrate Nemotron into its AI Companion; CrowdStrike Holdings Inc., which is testing the models in its Charlotte AI agents for writing cybersecurity queries; and Ernst & Young Global Ltd., which is adopting Nemotron Nano 2 to empower agents analyzing business data. Nvidia is empowering robotics and machines to "see" and reason about the world with new AI models that combine the ability to ingest visual information with the ability to reason about it. Vision language models, or VLMs, provided computer vision for robotics, allowing them to understand and interact with the world, but they lacked the ability to think about their actions. Today, Nvidia announced Cosmos Reason, a new open, customizable 7 billion-parameter reasoning VLM for physical AI vision agents and robotics. It allows robots and vision agents to think about what they see, much as humans do, and plan around what's in a scene using physics knowledge and common sense drawn from training data.
The company said it can help automate the curation and annotation of large, diverse training datasets, accelerating the development of high-accuracy AI models. It can also serve as a sophisticated reasoning engine for robot planning, parsing complex instructions into steps for VLA models, even in new environments. These advances form the foundation for smart cities, facilities and industrial processes. Nvidia said it's working with companies including Accenture plc, Belden Inc., DeepHow Inc., Milestone Systems A/S and Telit Cinterion Ltd. to increase productivity and safety at their locations using physical AI-based perception and reasoning. Infrastructure that can perceive, reason and react relies on sensors and vision AI through the Nvidia Metropolis platform, which makes it possible to develop and deploy video analytics AI agents and services for campuses and facilities.
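The planning role described above -- a reasoning model decomposing a high-level instruction into steps that a vision-language-action (VLA) model then executes -- can be sketched in outline. Everything below is illustrative: `ReasoningPlanner`, `VLAExecutor` and the step format are hypothetical stand-ins for the data flow, not the Cosmos Reason API.

```python
# Illustrative plan-then-act loop for an embodied agent. The planner
# stands in for a reasoning VLM (such as a model like Cosmos Reason);
# the executor stands in for a low-level VLA policy. Both are stubs.

from dataclasses import dataclass


@dataclass
class Step:
    action: str   # e.g. "navigate", "grasp"
    target: str   # object or location the action applies to


class ReasoningPlanner:
    """Stub for a reasoning VLM: maps an instruction plus a scene
    description to an ordered list of executable steps."""

    def plan(self, instruction: str, scene_objects: list[str]) -> list[Step]:
        # A real model would reason over pixels, physics and common
        # sense; here one decomposition is hard-coded to show the flow.
        steps = []
        for obj in scene_objects:
            if obj in instruction:
                steps.append(Step("navigate", obj))
                steps.append(Step("grasp", obj))
        return steps


class VLAExecutor:
    """Stub for a vision-language-action policy executing one step."""

    def execute(self, step: Step) -> str:
        return f"{step.action}:{step.target}"


def run_agent(instruction: str, scene_objects: list[str]) -> list[str]:
    planner, executor = ReasoningPlanner(), VLAExecutor()
    return [executor.execute(s) for s in planner.plan(instruction, scene_objects)]


if __name__ == "__main__":
    log = run_agent("pick up the red cup", ["red cup", "table"])
    print(log)  # ['navigate:red cup', 'grasp:red cup']
```

The point of the split is that the planner reasons once over the whole instruction while the executor handles each low-level step, which is the division of labor the release attributes to pairing a reasoning VLM with VLA models.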
[7]
Nvidia Unveils A.I. Models to Give Robots Human-Like Reasoning Skills
Nvidia moves beyond GPUs toward physical A.I. powering humanoids, self-driving cars and robotics. The A.I. boom has helped make Nvidia the world's most valuable publicly traded company, with a market cap of $4.3 trillion. But CEO Jensen Huang envisions the company's future centering on physical A.I. At its SIGGRAPH conference on Aug. 11, Nvidia unveiled a suite of A.I. models aimed at advancing physical A.I. systems, including humanoid robots and self-driving cars. The centerpiece is Cosmos Reason, a vision-language-action (VLA) model that the company says can give robots human-like reasoning abilities. "Computer graphics and A.I. are converging to fundamentally transform robotics," said Rev Lebaredian, Nvidia's vice president of Omniverse and simulation technologies, in a statement. "By combining A.I. reasoning with scalable, physically accurate simulation, we're enabling developers to build tomorrow's robots and autonomous vehicles that will transform trillions of dollars in industries." Cosmos Reason is a 7-billion-parameter model that draws on prior knowledge, physics and common sense to operate in the real world. When applied to physical A.I. systems, it will act "as the brain for deliberate, methodical decision-making," Nvidia said in a blog post. The model can also automate dataset curation and annotation, a capability already being used by Uber as well as Nvidia's own robotics and self-driving teams. Magna, a mobility tech company, is incorporating Cosmos Reason to help vehicles adapt faster to new environments.
Other adopters include VAST Data, Milestone Systems and Linker Vision, which are using it for traffic monitoring and safety improvements in urban and industrial settings. While companies like OpenAI have also developed VLA models for robotics, Nvidia claims Cosmos Reason stands out for its ability to handle multistep tasks, adapt to unfamiliar situations and navigate ambiguity. In addition to Cosmos Reason, Nvidia earlier this week also introduced Cosmos Transfer-2, an upcoming model that will let robot developers create photorealistic data from 3D simulation scenes and spatial control inputs. A distilled version of Cosmos Transfer offers faster performance and has already been adopted by robotics firm Lightwheel and medical robotics company Moon Surgical. These releases are part of Huang's broader plan to push Nvidia beyond its GPU roots. For now, the physical A.I. segment remains small: Nvidia's automotive and robotics division, despite growing 72 percent in revenue between February and April to $567 million, accounted for only around 1 percent of total quarterly sales. Even so, Huang, who declared in June that Nvidia "stopped thinking of ourselves as a chip company long ago," believes physical A.I. will one day define the company's future.
[8]
NVIDIA Opens Portals to World of Robotics With New Omniverse Libraries, Cosmos Physical AI Models and AI Computing Infrastructure
Powered by new RTX PRO™ Servers and DGX™ Cloud, the libraries and models let developers anywhere develop physically accurate digital twins, capture and reconstruct the real world in simulation, generate synthetic data for training physical AI models and build AI agents that understand the physical world. "Computer graphics and AI are converging to fundamentally transform robotics," said Rev Lebaredian, vice president of Omniverse and simulation technologies at NVIDIA. "By combining AI reasoning with scalable, physically accurate simulation, we're enabling developers to build tomorrow's robots and autonomous vehicles that will transform trillions of dollars in industries." New Omniverse Libraries Advance Applications for World Composition New Omniverse software development kits (SDKs) and libraries are now available for building and deploying industrial AI and robotics simulation applications. Omniverse NuRec rendering is now integrated in CARLA, a leading open-source simulator used by over 150,000 developers. Autonomous vehicle (AV) toolchain leader Foretellix is integrating NuRec, Omniverse Sensor RTX™ and Cosmos Transfer to enhance its scalable synthetic data generation with physically accurate scenarios. Voxel51's data engine for visual and multimodal AI, FiftyOne, supports NuRec to ease data preparation for reconstructions. FiftyOne is used by customers such as Porsche. Boston Dynamics, Figure AI, Hexagon, Lightwheel and Skild AI are adopting Omniverse libraries to accelerate their AI robotics development, while Amazon Devices & Services is using them to power a new manufacturing solution. Cosmos Advances World Generation for Robotics Cosmos WFMs, downloaded over 2 million times, let developers generate diverse data for training robots at scale using text, image and video prompts.
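The idea behind generating diverse training data from simulation can be illustrated with a minimal sketch: one labeled simulator scene is fanned out into many training variants by varying rendering conditions, while the ground-truth labels come free from the simulator. This is a conceptual stand-in, not the Cosmos Transfer interface; the condition lists and record fields are hypothetical.

```python
# Sketch of simulation-driven data augmentation: expand one labeled
# scene into many variants by sweeping rendering conditions. All field
# names are illustrative.

import itertools

WEATHER = ["clear", "rain", "fog"]
TIME_OF_DAY = ["day", "dusk", "night"]


def expand_scene(scene_id: str, labels: dict) -> list[dict]:
    """Return one training record per (weather, time-of-day) combination,
    all sharing the same ground-truth labels from the simulator."""
    records = []
    for weather, tod in itertools.product(WEATHER, TIME_OF_DAY):
        records.append({
            "scene": scene_id,
            "weather": weather,
            "time_of_day": tod,
            "labels": labels,  # labels are known exactly in simulation
        })
    return records


variants = expand_scene("intersection_042", {"pedestrians": 3})
print(len(variants))  # 9 variants from a single scene
```

Scaling this combinatorial expansion with photorealistic rendering is, at a high level, what lets one authored scene stand in for many real-world recording sessions.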
New models announced at SIGGRAPH deliver major advances in synthetic data generation speed, accuracy, language support and control. Lightwheel, Moon Surgical and Skild AI are using Cosmos Transfer to accelerate physical AI training by simulating diverse conditions at scale. Cosmos Reason Breaks Through World Understanding Since the introduction of OpenAI's CLIP model, vision language models (VLMs) have transformed computer-vision tasks like object and pattern recognition. However, they have not yet been able to solve multistep tasks or handle ambiguity and novel experiences. Cosmos Reason -- a new open, customizable, 7-billion-parameter reasoning VLM for physical AI and robotics -- lets robots and vision AI agents reason like humans, using prior knowledge, physics understanding and common sense to understand and act in the real world. Cosmos Reason can be used across robotics and physical AI applications. NVIDIA's robotics and DRIVE™ teams are using it for data curation and filtering, annotation and VLA post-training. Uber is using it to annotate and caption AV training data. Magna is developing with Cosmos Reason as part of its City Delivery platform -- a fully autonomous, low-cost solution for instant delivery -- to help vehicles adapt more quickly to new cities. Cosmos Reason adds world understanding to the vehicles' long-term trajectory planner. VAST Data, Milestone Systems and Linker Vision are adopting it to automate traffic monitoring, improve safety and enhance visual inspection in cities and industrial settings. New AI Infrastructure Powers Robotics Workloads Anywhere To enable developers to take full advantage of these advanced technologies and software libraries, NVIDIA announced AI infrastructure designed for the most demanding workloads.
Accelerating the Developer Ecosystem To help robotics and physical AI developers advance 3D and simulation technology adoption, NVIDIA also made a series of developer ecosystem announcements.
NVIDIA announces new AI models and infrastructure for robotics and physical AI applications, including Cosmos Reason, a 7-billion-parameter reasoning vision language model, and updates to the Nemotron model family.
NVIDIA has announced a significant advancement in AI technology with the introduction of Cosmos Reason, a 7-billion-parameter reasoning vision language model (VLM) designed specifically for physical AI applications and robotics [1]. The new model, unveiled at the SIGGRAPH conference, is part of NVIDIA's expanding Cosmos world model family and represents a leap forward in enabling robots and AI agents to "reason" about the physical world [1][2].
Cosmos Reason is built to serve as a planning model for embodied agents, allowing them to understand concepts like physics, object permanence, and space-time alignment [2]. Its applications span data curation, robot planning, and video analytics [2][5]. Companies like Uber, Magna, and VAST Data are already adopting Cosmos Reason for diverse uses, from annotating autonomous vehicle training data to enhancing city delivery platforms [5].
Alongside Cosmos Reason, NVIDIA has also introduced updates to its Nemotron model family, which focuses on enterprise AI applications [2]. The new models, Nemotron Nano 2 and Llama Nemotron Super 1.5, are designed to give AI agents improved reasoning capabilities, enabling them to explore broader options and deliver smarter results within set time limits [2].
To support the development and deployment of these advanced AI models, NVIDIA has announced several updates to its Omniverse platform and infrastructure, including new neural reconstruction libraries integrated into the open-source CARLA simulator, RTX PRO Blackwell Servers for robotics workloads, and the DGX Cloud management platform [1][5].
The announcements have garnered significant interest from industry leaders. Companies like CrowdStrike, Uber, Zoom, and Boston Dynamics are among the early adopters of these new technologies [2][5]. The potential impact is substantial, with AI agents expected to deliver up to $450 billion in revenue gains and cost savings by 2028, according to Capgemini [2].
NVIDIA's latest innovations represent a convergence of graphics, AI, and robotics technologies. As Rev Lebaredian, VP of Omniverse and simulation technologies at NVIDIA, stated, "Computer graphics and AI are converging to fundamentally transform robotics" [5]. This integration of AI reasoning with scalable, physically accurate simulation is poised to enable developers to build more advanced robots and autonomous vehicles, potentially transforming trillion-dollar industries [5].
The introduction of Cosmos Reason and the enhancements to the Nemotron models mark a significant step forward in the field of physical AI. As these technologies continue to evolve, they promise to unlock new possibilities in robotics, autonomous systems, and AI-driven decision-making across various sectors.