3 Sources
[1]
NVIDIA and Microsoft Advance Development on RTX AI PCs
NVIDIA TensorRT for RTX available via Windows ML to accelerate AI in apps, new NVIDIA NIM microservices and AI Blueprints enable developers to build new workflows, and Project G-Assist plug-ins enable enthusiasts to create assistant workflows.

Generative AI is transforming PC software into breakthrough experiences -- from digital humans to writing assistants, intelligent agents and creative tools. NVIDIA RTX AI PCs are powering this transformation with technology that makes it simpler to get started experimenting with generative AI and unlock greater performance on Windows 11.

NVIDIA TensorRT has been reimagined for RTX AI PCs, combining industry-leading TensorRT performance with just-in-time, on-device engine building and an 8x smaller package size for seamless AI deployment to more than 100 million RTX AI PCs. Announced at Microsoft Build, TensorRT for RTX is natively supported by Windows ML -- a new inference stack that provides app developers with both broad hardware compatibility and state-of-the-art performance.

For developers looking for AI features ready to integrate, NVIDIA software development kits (SDKs) offer a wide array of options, from NVIDIA DLSS to multimedia enhancements like NVIDIA RTX Video. This month, top software applications from Autodesk, Bilibili, Chaos, LM Studio and Topaz Labs are releasing updates to unlock RTX AI features and acceleration.

AI enthusiasts and developers can easily get started with AI using NVIDIA NIM -- prepackaged, optimized AI models that can run in popular apps like AnythingLLM, Microsoft VS Code and ComfyUI. Releasing this week, the FLUX.1-schnell image generation model will be available as a NIM microservice, and the popular FLUX.1-dev NIM microservice has been updated to support more RTX GPUs.

Those looking for a simple, no-code way to dive into AI development can tap into Project G-Assist -- the RTX PC AI assistant in the NVIDIA app -- to build plug-ins to control PC apps and peripherals using natural language AI. New community plug-ins are now available, including Google Gemini web search, Spotify, Twitch, IFTTT and SignalRGB.

Accelerated AI Inference With TensorRT for RTX

Today's AI PC software stack requires developers to compromise on performance or invest in custom optimizations for specific hardware. Windows ML was built to solve these challenges.

Windows ML is powered by ONNX Runtime and seamlessly connects to an optimized AI execution layer provided and maintained by each hardware manufacturer. For GeForce RTX GPUs, Windows ML automatically uses the TensorRT for RTX inference library for high performance and rapid deployment. Compared with DirectML, TensorRT delivers over 50% faster performance for AI workloads on PCs.

Windows ML also delivers quality-of-life benefits for developers. It can automatically select the right hardware -- GPU, CPU or NPU -- to run each AI feature, and download the execution provider for that hardware, removing the need to package those files into the app. This allows the latest TensorRT performance optimizations to be delivered to users as soon as they're ready.

TensorRT, a library originally built for data centers, has been redesigned for RTX AI PCs. Instead of pre-generating TensorRT engines and packaging them with the app, TensorRT for RTX uses just-in-time, on-device engine building to optimize how the AI model is run for the user's specific RTX GPU in mere seconds. And the library's packaging has been streamlined, reducing its file size by 8x.
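To make the engine-building idea concrete, here is a minimal sketch of the classic TensorRT build flow that TensorRT for RTX now automates on the user's device. It assumes the standard tensorrt Python package and an ONNX model file named model.onnx; the actual TensorRT for RTX SDK API may differ from this classic flow.

```python
# A hedged sketch of the classic TensorRT build step that TensorRT for RTX
# performs just-in-time on-device. Assumes the `tensorrt` Python package
# and a placeholder "model.onnx"; the RTX-specific SDK API may differ.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Parse the (already quantized/optimized) ONNX model into a TensorRT network.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # reduced precision, as the articles describe

# Kernel selection happens here: TensorRT picks the best Tensor Core kernels
# for the GPU it is actually running on, then serializes the result.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

The point of the redesign is that this build step, which matches kernels to the specific GPU, no longer has to be run ahead of time for every GPU model and shipped with the app -- it runs on the user's device in seconds.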
TensorRT for RTX is available to developers through the Windows ML preview today, and will be available as a standalone SDK at NVIDIA Developer in June.

Developers looking to add AI features or boost app performance can tap into a broad range of NVIDIA SDKs. These include NVIDIA CUDA and TensorRT for GPU acceleration; NVIDIA DLSS and Optix for 3D graphics; NVIDIA RTX Video and Maxine for multimedia; and NVIDIA Riva and ACE for generative AI.

Top applications are releasing updates this month to enable unique features using these NVIDIA SDKs, including:

● Topaz Labs is releasing a generative AI video model to enhance video quality, accelerated by CUDA.
● Chaos Enscape and Autodesk VRED are adding DLSS 4 for faster performance and better image quality.
● Bilibili is integrating NVIDIA Broadcast features, enabling streamers to activate NVIDIA Virtual Background directly within Bilibili Livehime to enhance the quality of livestreams.

NVIDIA looks forward to continuing to work with Microsoft and top AI app developers to help them accelerate their AI features on RTX-powered machines through the Windows ML and TensorRT integration.

Local AI Made Easy With NIM Microservices and AI Blueprints

Getting started with developing AI on PCs can be daunting. AI developers and enthusiasts have to select a model from over 1.2 million on Hugging Face, quantize it into a format that runs well on PC, find and install all the dependencies to run it, and more.

NVIDIA NIM makes it easy to get started by providing a curated list of AI models, prepackaged with all the files needed to run them and optimized to achieve full performance on RTX GPUs. And since they're containerized, the same NIM microservice can be run seamlessly across PCs or the cloud. NVIDIA NIM microservices are available to download through build.nvidia.com or through top AI apps like AnythingLLM, ComfyUI and AI Toolkit for Visual Studio Code.

During COMPUTEX, NVIDIA will release the FLUX.1-schnell NIM microservice -- an image generation model from Black Forest Labs for fast image generation -- and update the FLUX.1-dev NIM microservice to add compatibility for a wide range of GeForce RTX 50 and 40 Series GPUs. These NIM microservices enable faster performance with TensorRT and quantized models. On NVIDIA Blackwell GPUs, they run over twice as fast as running them natively, thanks to FP4 and RTX optimizations.

AI developers can also jumpstart their work with NVIDIA AI Blueprints -- sample workflows and projects using NIM microservices. NVIDIA last month released the NVIDIA AI Blueprint for 3D-guided generative AI, a powerful way to control composition and camera angles of generated images by using a 3D scene as a reference. Developers can modify the open-source blueprint for their needs or extend it with additional functionality.

New Project G-Assist Plug-Ins and Sample Projects Now Available

NVIDIA recently released Project G-Assist as an experimental AI assistant integrated into the NVIDIA app. G-Assist enables users to control their GeForce RTX system using simple voice and text commands, offering a more convenient interface compared to manual controls spread across numerous legacy control panels.

Developers can also use Project G-Assist to easily build plug-ins, test assistant use cases and publish them through NVIDIA's Discord and GitHub. The Project G-Assist Plug-in Builder -- a ChatGPT-based app that allows no-code or low-code development with natural language commands -- makes it easy to start creating plug-ins. These lightweight, community-driven add-ons use straightforward JSON definitions and Python logic (see the sketch below).

New open-source plug-in samples are available now on GitHub, showcasing diverse ways on-device AI can enhance PC and gaming workflows. They include:

● Gemini: The existing Gemini plug-in that uses Google's cloud-based free-to-use LLM has been updated to include real-time web search capabilities.
● IFTTT: Enable automations from the hundreds of endpoints that work with IFTTT, such as IoT and home automation systems, enabling routines spanning digital setups and physical surroundings.
● Discord: Easily share game highlights or messages directly to Discord servers without disrupting gameplay.

Explore the GitHub repository for more examples -- including hands-free music control via Spotify, livestream status checks with Twitch, and more.
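As an illustration of the "JSON definitions and Python logic" structure described above, here is a minimal, hypothetical plug-in sketch. The manifest fields and the execute entry point are assumptions for illustration, not the exact schema NVIDIA's published samples use -- consult the GitHub samples for the real format.

```python
# A hypothetical G-Assist plug-in, sketched from the articles' description of
# "straightforward JSON definitions and Python logic". The manifest schema and
# entry-point convention below are illustrative assumptions, not NVIDIA's
# published format.
import json

# manifest.json -- declares the function the assistant can call and its parameters
MANIFEST = json.loads("""
{
  "name": "room_lights",
  "description": "Set the color of the user's RGB lighting",
  "parameters": {
    "color": {"type": "string", "description": "A color name, e.g. 'teal'"}
  }
}
""")

def execute(params: dict) -> str:
    """Python logic invoked when the assistant routes a matching voice/text command."""
    color = params.get("color", "white")
    # A real plug-in would call a device SDK here (e.g. a lighting-control API);
    # this sketch just reports what would happen.
    return f"Setting lighting to {color}."

if __name__ == "__main__":
    print(execute({"color": "teal"}))  # -> "Setting lighting to teal."
```

The division of labor is the appeal: the JSON manifest tells the assistant what natural-language requests the function can serve, while the Python function holds the actual integration logic.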
Companies are adopting AI as the new PC interface. For example, SignalRGB is developing a G-Assist plug-in that enables unified lighting control across multiple manufacturers. Users will soon be able to install this plug-in directly from the SignalRGB app.

Starting this week, the AI community will also be able to use G-Assist as a custom component in Langflow -- enabling users to integrate function-calling capabilities in low-code or no-code workflows, AI applications and agentic flows.

Enthusiasts interested in developing and experimenting with Project G-Assist plug-ins are invited to join the NVIDIA Developer Discord channel to collaborate, share creations and gain support.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.
[2]
NVIDIA and Microsoft Accelerate Agentic AI Innovation, From Cloud to PC
Agentic AI is redefining scientific discovery and unlocking research breakthroughs and innovations across industries. Through deepened collaboration, NVIDIA and Microsoft are delivering advancements that accelerate agentic AI-powered applications from the cloud to the PC.

At Microsoft Build, Microsoft unveiled Microsoft Discovery, an extensible platform built to empower researchers to transform the entire discovery process with agentic AI. This will help research and development departments across various industries accelerate the time to market for new products, as well as speed and expand the end-to-end discovery process for all scientists.

Microsoft Discovery will integrate the NVIDIA ALCHEMI NIM microservice, which optimizes AI inference for chemical simulations, to accelerate materials science research with property prediction and candidate recommendation. The platform will also integrate NVIDIA BioNeMo NIM microservices, tapping into pretrained AI workflows to speed up AI model development for drug discovery. These integrations equip researchers with accelerated performance for faster scientific discoveries. In testing, researchers at Microsoft used Microsoft Discovery to detect a novel coolant prototype with promising properties for immersion cooling in data centers in under 200 hours, rather than months or years with traditional methods.

Advancing Agentic AI With NVIDIA GB200 Deployments at Scale

Microsoft is rapidly deploying tens of thousands of NVIDIA GB200 NVL72 rack-scale systems across its Azure data centers, boosting both performance and efficiency. Azure's ND GB200 v6 virtual machines -- built on a rack-scale architecture with up to 72 NVIDIA Blackwell GPUs per rack and advanced liquid cooling -- deliver up to 35x more inference throughput compared with previous ND H100 v5 VMs accelerated by eight NVIDIA H100 GPUs, setting a new benchmark for AI workloads.

These innovations are underpinned by custom server designs, high-speed NVIDIA NVLink interconnects and NVIDIA Quantum InfiniBand networking -- enabling seamless scaling to tens of thousands of Blackwell GPUs for demanding generative and agentic AI applications.

Microsoft chairman and CEO Satya Nadella and NVIDIA founder and CEO Jensen Huang also highlighted how Microsoft and NVIDIA's collaboration is compounding performance gains through continuous software optimizations across NVIDIA architectures on Azure. This approach maximizes developer productivity, lowers total cost of ownership and accelerates all workloads, including AI and data processing -- all while driving greater efficiency per dollar and per watt for customers.

NVIDIA AI Reasoning and Healthcare Microservices on Azure AI Foundry

Building on the NIM integration in Azure AI Foundry announced at NVIDIA GTC, Microsoft and NVIDIA are expanding the platform with the NVIDIA Llama Nemotron family of open reasoning models and NVIDIA BioNeMo NIM microservices, which deliver enterprise-grade, containerized inferencing for complex decision-making and domain-specific AI workloads.

Developers can now access optimized NIM microservices for advanced reasoning in Azure AI Foundry. These include the NVIDIA Llama Nemotron Super and Nano models, which offer advanced multistep reasoning, coding and agentic capabilities, delivering up to 20% higher accuracy and 5x faster inference than previous models.
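For developers curious what accessing one of these reasoning models in Azure AI Foundry looks like in code, the sketch below uses the azure-ai-inference Python package against a hypothetical Llama Nemotron deployment. The endpoint, key and model identifier are placeholders, and the exact model name in Foundry may differ.

```python
# A hedged sketch: calling a reasoning model deployed in Azure AI Foundry
# via the azure-ai-inference package. Endpoint, key and model name below
# are placeholders, not real values.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-foundry-resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    model="Llama-Nemotron-Super",  # hypothetical deployment name
    messages=[
        SystemMessage(content="You are a careful multistep reasoner."),
        UserMessage(content="Plan the steps to benchmark an inference stack."),
    ],
)
print(response.choices[0].message.content)
```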
Healthcare-focused BioNeMo NIM microservices like ProteinMPNN, RFDiffusion and OpenFold2 address critical applications in digital biology, drug discovery and medical imaging, enabling researchers and clinicians to accelerate protein science, molecular modeling and genomic analysis for improved patient care and faster scientific innovation. This expanded integration empowers organizations to rapidly deploy high-performance AI agents, connecting to these models and other specialized healthcare solutions with robust reliability and simplified scaling.

Accelerating Generative AI on Windows 11 With RTX AI PCs

Generative AI is reshaping PC software with entirely new experiences -- from digital humans to writing assistants, intelligent agents and creative tools. NVIDIA RTX AI PCs make it easy to get started experimenting with generative AI and unlock greater performance on Windows 11.

At Microsoft Build, NVIDIA and Microsoft are unveiling an AI inferencing stack to simplify development and boost inference performance for Windows 11 PCs. NVIDIA TensorRT has been reimagined for RTX AI PCs, combining industry-leading TensorRT performance with just-in-time, on-device engine building and an 8x smaller package size for seamless AI deployment to the more than 100 million RTX AI PCs.

Announced at Microsoft Build, TensorRT for RTX is natively supported by Windows ML -- a new inference stack that provides app developers with both broad hardware compatibility and state-of-the-art performance. TensorRT for RTX is available in the Windows ML preview starting today, and will be available as a standalone software development kit from NVIDIA Developer in June.

Learn more about how TensorRT for RTX and Windows ML are streamlining software development. Explore new NIM microservices and AI Blueprints for RTX, and RTX-powered updates from Autodesk, Bilibili, Chaos, LM Studio and Topaz in the RTX AI PC blog, and join the community discussion on Discord.
[3]
Nvidia and Microsoft accelerate AI processing on PCs
Generative AI is transforming PC software into breakthrough experiences -- from digital humans to writing assistants, intelligent agents and creative tools. Nvidia RTX AI PCs are powering this transformation with technology that makes it simpler to get started experimenting with generative AI, and unlocking greater performance on Windows 11.

TensorRT for RTX AI PCs

TensorRT has been reimagined for RTX AI PCs, combining industry-leading TensorRT performance with just-in-time, on-device engine building and an 8x smaller package size for fast AI deployment to the more than 100 million RTX AI PCs. Announced at Microsoft Build, TensorRT for RTX is natively supported by Windows ML -- a new inference stack that provides app developers with both broad hardware compatibility and state-of-the-art performance.

Gerardo Delgado, director of product for AI PC at Nvidia, said in a press briefing that the AI PCs start with Nvidia's RTX hardware, CUDA programming and an array of AI models. He noted that at a high level, an AI model is basically a set of mathematical operations along with a way to run them, and the combination of operations and how to run them is what is normally known as a graph in machine learning.

He added, "Our GPUs are going to execute these operations with Tensor cores. But Tensor cores change from generation to generation. We have been implementing them from time to time, and then within a generation of GPUs, you also have different Tensor core counts depending on the SKU. Being able to match the right Tensor core to each mathematical operation is the key to achieving performance. So TensorRT does this in a two-step approach."

First, Nvidia optimizes the AI model, quantizing it to reduce the precision of parts of the model or some of its layers. Once the model is optimized, TensorRT consumes it and prepares a plan with a pre-selection of kernels. Compared with the standard way of running AI on Windows, Nvidia can achieve about 1.6 times the performance on average.

Now there will be a new version of TensorRT for RTX to improve this experience. It's designed specifically for RTX AI PCs and provides the same TensorRT performance, but instead of having to pre-generate the TensorRT engines per GPU, it focuses on optimizing the model and ships a generic TensorRT engine.

"Then once the application is installed, TensorRT for RTX will generate the right TensorRT engine for your specific GPU in just seconds. This greatly simplifies the developer workflow," he said. Among the results are a reduction in the size of libraries, better performance for video generation, and better quality livestreams, Delgado said.

Nvidia SDKs make it easier for app developers to integrate AI features and accelerate their apps on GeForce RTX GPUs. This month, top software applications from Autodesk, Bilibili, Chaos, LM Studio and Topaz are releasing updates to unlock RTX AI features and acceleration.

AI enthusiasts and developers can easily get started with AI using Nvidia NIM -- pre-packaged, optimized AI models that run in popular apps like AnythingLLM, Microsoft VS Code and ComfyUI. The FLUX.1-schnell image generation model is now available as a NIM, and the popular FLUX.1-dev NIM has been updated to support more RTX GPUs.

For a no-code option to dive into AI development, Project G-Assist -- the RTX PC AI assistant in the Nvidia app -- has enabled a simple way to build plug-ins to create assistant workflows.
New community plug-ins are now available, including Google Gemini web search, Spotify, Twitch, IFTTT and SignalRGB.

Accelerated AI inference with TensorRT for RTX

Today's AI PC software stack requires developers to choose between frameworks that have broad hardware support but lower performance, or optimized paths that only cover certain hardware or model types and require the developer to maintain multiple paths. The new Windows ML inference framework was built to solve these challenges.

Windows ML is built on top of ONNX Runtime and seamlessly connects to an optimized AI execution layer provided and maintained by each hardware manufacturer. For GeForce RTX GPUs, Windows ML automatically uses TensorRT for RTX -- an inference library optimized for high performance and rapid deployment. Compared to DirectML, TensorRT delivers over 50% faster performance for AI workloads on PCs.

Windows ML also delivers quality-of-life benefits for the developer. It can automatically select the right hardware to run each AI feature, and download the execution provider for that hardware, removing the need to package those files into the app. This allows Nvidia to provide the latest TensorRT performance optimizations to users as soon as they are ready. And because it's built on ONNX Runtime, Windows ML works with any ONNX model.

To further enhance the experience for developers, TensorRT has been reimagined for RTX. Instead of having to pre-generate TensorRT engines and package them with the app, TensorRT for RTX uses just-in-time, on-device engine building to optimize how the AI model is run for the user's specific RTX GPU in mere seconds. And the library has been streamlined, reducing its file size by a massive eight times.

TensorRT for RTX is available to developers through the Windows ML preview today, and will be available directly as a standalone SDK at Nvidia Developer, targeting a June release. Developers can learn more in Nvidia's Microsoft Build Developer Blog, the TensorRT for RTX launch blog, and Microsoft's Windows ML blog.

Expanding the AI ecosystem on Windows PCs

Developers looking to add AI features or boost app performance can tap into a broad range of Nvidia SDKs. These include CUDA and TensorRT for GPU acceleration; DLSS and Optix for 3D graphics; RTX Video and Maxine for multimedia; and Riva, Nemotron or ACE for generative AI.

Top applications are releasing updates this month to enable unique features using these SDKs. Topaz is releasing a generative AI video model to enhance video quality, accelerated by CUDA. Chaos Enscape and Autodesk VRED are adding DLSS 4 for faster performance and better image quality. Bilibili is integrating Nvidia Broadcast features, enabling streamers to activate Nvidia Virtual Background directly within Bilibili Livehime to enhance the quality of livestreams.

Local AI made easy with NIM microservices and AI blueprints

Getting started with developing AI on PCs can be daunting. AI developers and enthusiasts have to select a model from over 1.2 million on Hugging Face, quantize it into a format that runs well on PC, find and install all the dependencies to run it, and more.

Nvidia NIM makes it easy to get started by providing a curated list of AI models, pre-packaged with all the files needed to run them, and optimized to achieve full performance on RTX GPUs. And as containerized microservices, the same NIM can be run seamlessly across PC or cloud. A NIM is a package -- a generative AI model that's been prepackaged with everything you need to run it.
It's already optimized with TensorRT for RTX GPUs, and it comes with an easy-to-use API that's OpenAI API-compatible, which makes it work with the top AI applications users are already using today (a hedged usage sketch follows at the end of this article).

At Computex, Nvidia is releasing the FLUX.1-schnell NIM -- an image generation model from Black Forest Labs for fast image generation -- and updating the FLUX.1-dev NIM to add compatibility for a wide range of GeForce RTX 50 and 40 Series GPUs. These NIMs enable faster performance with TensorRT, plus additional performance thanks to quantized models. On Blackwell GPUs, these run over twice as fast as running them natively, thanks to FP4 and RTX optimizations.

AI developers can also jumpstart their work with Nvidia AI Blueprints -- sample workflows and projects using NIM. Last month, Nvidia released the 3D Guided Generative AI Blueprint, a powerful way to control composition and camera angles of generated images by using a 3D scene as a reference. Developers can modify the open-source blueprint for their needs or extend it with additional functionality.

New Project G-Assist plug-ins and sample projects now available

Nvidia recently released Project G-Assist as an experimental AI assistant integrated into the Nvidia app. G-Assist enables users to control their GeForce RTX system using simple voice and text commands, offering a more convenient interface compared to manual controls spread across multiple legacy control panels.

Developers can also use Project G-Assist to easily build plug-ins, test assistant use cases and publish them through Nvidia's Discord and GitHub. To make it easier to get started creating plug-ins, Nvidia has made available the easy-to-use Plug-in Builder -- a ChatGPT-based app that allows no-code/low-code development with natural language commands. These lightweight, community-driven add-ons leverage straightforward JSON definitions and Python logic.

New open-source samples are available now on GitHub, showcasing diverse ways on-device AI can enhance PC and gaming workflows:

● Gemini: The existing Gemini plug-in that uses Google's cloud-based free-to-use LLM has been updated to include real-time web search capabilities.
● IFTTT: Enable automations from the hundreds of endpoints that work with IFTTT, such as IoT and home automation systems, enabling routines spanning digital setups and physical surroundings.
● Discord: Easily share game highlights or messages directly to Discord servers without disrupting gameplay.

Explore the GitHub repository for additional examples -- including hands-free music control via Spotify, livestream status checks with Twitch, and more.

Companies are also adopting AI as the new PC interface. For example, SignalRGB is developing a G-Assist plug-in that enables unified lighting control across multiple manufacturers. SignalRGB users will soon be able to install this plug-in directly from the SignalRGB app.

Enthusiasts interested in developing and experimenting with Project G-Assist plug-ins are invited to join the Nvidia Developer Discord channel to collaborate, share creations and receive support during development.

Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.
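To ground the "OpenAI API-compatible" point above, here is a minimal sketch of calling a locally running NIM through the standard openai Python client. The localhost port and model identifier are assumptions for illustration; check the specific NIM's documentation for its actual values.

```python
# A hedged sketch: talking to a locally hosted NIM through its
# OpenAI-compatible endpoint. The base_url port and model name are
# illustrative assumptions, not guaranteed defaults.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-locally",         # local NIMs typically don't check the key
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # hypothetical NIM model id
    messages=[{"role": "user", "content": "Summarize what a NIM is."}],
)
print(response.choices[0].message.content)
```

Because any app that already speaks the OpenAI API can simply point its base URL at the local microservice, this compatibility is plausibly what lets apps like AnythingLLM use NIMs without custom integration work.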
NVIDIA and Microsoft announce new tools and technologies to enhance AI capabilities on Windows PCs, including TensorRT for RTX and Windows ML integration, aimed at simplifying AI development and boosting performance.
NVIDIA and Microsoft have announced a significant collaboration to enhance AI capabilities on Windows PCs, introducing new tools and technologies that promise to simplify AI development and boost performance [1].
At the heart of this collaboration is TensorRT for RTX, a reimagined version of NVIDIA's TensorRT library specifically designed for RTX AI PCs. This new iteration combines TensorRT's industry-leading performance with just-in-time, on-device engine building, and boasts an 8x smaller package size [1].
Key features of TensorRT for RTX include:

● Just-in-time, on-device engine building that optimizes the model for the user's specific RTX GPU in seconds
● An 8x smaller package size for seamless deployment
● Over 50% faster performance for AI workloads on PCs compared with DirectML
● Availability through the Windows ML preview today, with a standalone SDK coming from NVIDIA Developer in June
Microsoft's Windows ML, built on top of ONNX Runtime, aims to solve challenges in the current AI PC software stack. It provides:

● Broad hardware compatibility combined with state-of-the-art performance
● Automatic selection of the right hardware -- GPU, CPU or NPU -- for each AI feature
● Automatic download of the execution provider for that hardware, so apps don't need to package those files
● Compatibility with any ONNX model
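Windows ML handles execution-provider selection automatically; with plain ONNX Runtime, the developer picks providers explicitly. The sketch below, which assumes the onnxruntime package and a placeholder model.onnx file, shows the manual version of what Windows ML abstracts away.

```python
# A minimal sketch of explicit execution-provider selection in ONNX Runtime,
# the layer Windows ML builds on. Provider availability depends on the
# installed onnxruntime build and local hardware; "model.onnx" is a placeholder.
import numpy as np
import onnxruntime as ort

# Ordered by preference: ONNX Runtime falls back to the next provider
# if the first is unavailable on this machine.
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
session = ort.InferenceSession("model.onnx", providers=providers)

# Build a dummy input matching the model's declared shape
# (dynamic dimensions are replaced with 1 for this demo).
inp = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
dummy_input = np.random.rand(*shape).astype(np.float32)

outputs = session.run(None, {inp.name: dummy_input})
print("Ran on:", session.get_providers()[0], "| output shape:", outputs[0].shape)
```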
To further facilitate AI development, NVIDIA introduced:

● NIM microservices: prepackaged, optimized AI models, including the new FLUX.1-schnell image generation microservice and an updated FLUX.1-dev microservice supporting more RTX GPUs [1]
● AI Blueprints: sample workflows and projects built on NIM microservices, such as the blueprint for 3D-guided generative AI [1]
● Project G-Assist plug-ins: community add-ons for the RTX PC AI assistant, including Google Gemini web search, Spotify, Twitch, IFTTT and SignalRGB [1]
NVIDIA is providing a range of SDKs to help developers add AI features and boost app performance. These include CUDA, TensorRT, DLSS, Optix, RTX Video, Maxine, Riva, and ACE [3].
Several top software applications are releasing updates to leverage these new AI capabilities:

● Topaz Labs: a generative AI video model for enhancing video quality, accelerated by CUDA [3]
● Chaos Enscape and Autodesk VRED: DLSS 4 for faster performance and better image quality [3]
● Bilibili: NVIDIA Broadcast features such as Virtual Background within Bilibili Livehime [3]
● LM Studio: updates to unlock RTX AI features and acceleration [1]
These developments mark a significant step forward in making AI more accessible and powerful on consumer PCs, potentially transforming various applications from digital assistants to creative tools.