Critical Vulnerabilities in Nvidia's Triton Inference Server Expose AI Models to Potential Attacks


Security researchers uncover a chain of high-severity vulnerabilities in Nvidia's Triton Inference Server that could lead to remote code execution and AI model theft. Nvidia releases patches to address the issues.

Vulnerability Discovery in Nvidia's Triton Inference Server

Security researchers from Wiz have uncovered a chain of high-severity vulnerabilities in Nvidia's Triton Inference Server, an open-source platform designed for running AI models at scale. These flaws, if exploited, could potentially lead to remote code execution (RCE) and expose organizations to significant risks [1].

Source: Dataconomy

The Vulnerability Chain

The researchers identified three critical vulnerabilities in the Triton Inference Server's Python backend:

  1. CVE-2025-23320 (CVSS score: 7.5): A flaw that allows attackers to exceed the shared memory limit by sending a very large request, revealing the unique name of the backend's internal IPC shared memory region [2] (see the hardening sketch after this list).

  2. CVE-2025-23319 (CVSS score: 8.1): An out-of-bounds write vulnerability that can be exploited using the information leaked from CVE-2025-23320.

  3. CVE-2025-23334 (CVSS score: 5.9): An out-of-bounds read vulnerability that, when combined with the other flaws, completes the attack chain.
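
Because the chain starts with an oversized request, one generic hardening step (in addition to patching, never as a substitute for it) is to cap request body sizes at a gateway in front of the inference endpoint. The sketch below is a minimal, hypothetical WSGI-style guard; the 10 MB limit, the `size_limit_middleware` name, and the idea of fronting Triton with such a proxy are assumptions for illustration, not an Nvidia-documented mitigation.

```python
# Minimal sketch of a request-size guard for a gateway placed in front of an
# inference endpoint. Purely illustrative; the limit below is an assumption.
MAX_BODY_BYTES = 10 * 1024 * 1024  # assumed 10 MB cap for this example


def size_limit_middleware(app, max_bytes=MAX_BODY_BYTES):
    """Wrap a WSGI application and reject requests with oversized bodies."""
    def guarded(environ, start_response):
        try:
            declared = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            declared = 0
        if declared > max_bytes:
            start_response("413 Payload Too Large",
                           [("Content-Type", "text/plain")])
            return [b"request body exceeds configured limit\n"]
        return app(environ, start_response)

    return guarded
```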

Potential Impact and Risks

If successfully exploited, these vulnerabilities could allow an unauthenticated attacker to gain complete control of the Triton Inference Server. The potential consequences include:

  1. Theft of valuable AI models
  2. Exposure of sensitive data
  3. Manipulation of AI model responses
  4. Establishment of a foothold for deeper network penetration [3]

Source: The Hacker News

Widespread Usage and Affected Systems

Triton Inference Server is used by numerous organizations for AI/ML workloads, including major companies such as Microsoft, Amazon, Oracle, Siemens, and American Express. A 2021 press release indicated that over 25,000 companies use Nvidia's AI stack [4].

The vulnerabilities affect both Windows and Linux systems running the Triton Inference Server [3].

Nvidia's Response and Mitigation

Nvidia has addressed these vulnerabilities in version 25.07 of the Triton Inference Server, released on August 4, 2025. The company strongly recommends that all users update to this latest version immediately [1].

Nir Ohfeld, Wiz's Head of Vulnerability Research, emphasized the importance of updating: "The single most important step is to update to the patched version of the Nvidia Triton Inference Server (version 25.07 or newer). This directly fixes the entire vulnerability chain." [5]
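
As a quick way to check which build a deployment is actually running, Triton's HTTP API exposes a server-metadata endpoint (GET /v2, on port 8000 by default). The sketch below, which assumes a locally reachable endpoint at a placeholder URL, simply prints the reported server name and version; note that this field reports the Triton core version rather than the 25.07 container tag, so operators would still need to map it against Nvidia's release notes before concluding a host is patched.

```python
# Minimal sketch: query a Triton deployment's server-metadata endpoint and
# print the reported version. The URL below is a placeholder for this example.
import json
import urllib.request

TRITON_URL = "http://localhost:8000/v2"  # assumed host/port; adjust to your deployment

with urllib.request.urlopen(TRITON_URL, timeout=5) as resp:
    meta = json.load(resp)

print(f"server:  {meta.get('name')}")
print(f"version: {meta.get('version')}")
```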

Broader Implications for AI Security

Source: TechRadar

This incident highlights the growing importance of security in AI infrastructure. As companies increasingly deploy AI and machine learning technologies, securing the underlying infrastructure becomes paramount. The discovery of these vulnerabilities underscores the need for a defense-in-depth approach, where security is considered at every layer of an application [1].

While there is currently no evidence of these vulnerabilities being exploited in the wild, the widespread use of Nvidia's Triton Inference Server in AI workloads makes it a potentially attractive target for attackers [5].
