Curated by THEOUTPOST
On Fri, 7 Mar, 12:02 AM UTC
2 Sources
[1]
New AI defense method shields models from adversarial attacks
Neural networks, a type of artificial intelligence modeled on the connectivity of the human brain, are driving critical breakthroughs across a wide range of scientific domains. But these models face a significant threat from adversarial attacks, which can derail predictions and produce incorrect information. Los Alamos National Laboratory researchers have now pioneered a novel purification strategy that counteracts adversarial assaults and preserves the robust performance of neural networks. Their research is published on the arXiv preprint server.

"Adversarial attacks to AI systems can take the form of tiny, near-invisible tweaks to input images, subtle modifications that can steer the model toward the outcome an attacker wants," said Manish Bhattarai, Los Alamos computer scientist. "Such vulnerabilities allow malicious actors to flood digital channels with deceptive or harmful content under the guise of genuine outputs, posing a direct threat to trust and reliability in AI-driven technologies."

The Low-Rank Iterative Diffusion (LoRID) method removes adversarial interventions from input data by harnessing the power of generative denoising diffusion processes in tandem with advanced tensor decomposition techniques. In a series of tests on benchmark datasets, LoRID achieved unparalleled accuracy in neutralizing adversarial noise in attack scenarios, potentially advancing a more secure, reliable AI capability.

Defeating dangerous noise

Diffusion is a technique for training AI models by adding noise to data and then teaching the models to remove it. By learning to clean up the noise, the AI model effectively learns the underlying structure of the data, enabling it to generate realistic samples on its own. In diffusion-based purification, the model leverages its learned representation of "clean" data to identify and eliminate any adversarial interference introduced into the input.

Unfortunately, applying too many noise-purifying steps can strip away essential details from the data -- imagine scrubbing a photo so aggressively that it loses clarity -- while too few steps leave room for harmful perturbations to linger. The LoRID method navigates this trade-off by employing multiple rounds of denoising at the earlier phases of the diffusion process, helping the model eliminate precisely the right amount of noise without compromising the meaningful content of the data, thereby fortifying the model against attacks.

Crucially, adversarial inputs often reveal subtle "low-rank" signatures -- patterns that can slip past complex defenses. By weaving in a technique called tensor factorization, LoRID pinpoints these low-rank aspects, bolstering the model's defense in large adversarial attack regimes.

The team tested LoRID using widely recognized benchmark datasets such as CIFAR-10, CIFAR-100, Celeb-HQ, and ImageNet, evaluating its performance against state-of-the-art black-box and white-box adversarial attacks. In white-box attacks, adversaries have full knowledge of the AI model's architecture and parameters. In black-box attacks, they only see inputs and outputs, with the model's internal workings hidden. Across every test, LoRID consistently outperformed other methods, particularly in terms of robust accuracy -- the key indicator of a model's reliability when under adversarial threat.
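The purification loop described above can be illustrated with a short, self-contained PyTorch sketch. The toy denoiser, the noise schedule, and the choice to repeat the early denoising phase several times are illustrative assumptions based on the description in the article, not the Los Alamos implementation.

```python
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Stand-in for a pretrained diffusion denoiser that predicts the noise
    present in a noisy image at a given timestep (illustrative only)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, x, t):
        return self.net(x)  # a real denoiser would also condition on t

def purify(x, denoiser, t_star=0.3, inner_steps=10, rounds=3):
    """Diffusion-style purification sketch: lightly noise the input up to an
    early timestep t_star, then denoise it back; repeating this early phase
    several rounds mirrors the multi-iteration idea described in the article."""
    for _ in range(rounds):
        # Forward (noising) step: blend the input with Gaussian noise.
        noise = torch.randn_like(x)
        x_t = (1 - t_star) ** 0.5 * x + t_star ** 0.5 * noise
        # Reverse (denoising) steps: gradually subtract the predicted noise.
        for step in range(inner_steps, 0, -1):
            t = t_star * step / inner_steps
            predicted_noise = denoiser(x_t, t)
            x_t = x_t - (t_star / inner_steps) * predicted_noise
        x = x_t
    return x

if __name__ == "__main__":
    images = torch.rand(4, 3, 32, 32)      # e.g. a CIFAR-10-sized batch
    cleaned = purify(images, ToyDenoiser())
    print(cleaned.shape)                   # torch.Size([4, 3, 32, 32])
```

The outer loop captures the trade-off the article describes: each round re-noises the input only lightly and then denoises it, so adversarial perturbations are progressively washed out while most of the image content survives.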
Venado helps unlock efficiency, results

The team ran the LoRID models on Venado, the Lab's newest, AI-capable supercomputer, to test a range of state-of-the-art vision models against both black-box and white-box adversarial attacks. By harnessing multiple Venado nodes for several weeks -- an ambitious effort given the massive computing requirements -- they became the first group to undertake such a comprehensive analysis. Venado's power turned months of simulation into mere hours, slashing the total development timeline from years to just one month and significantly reducing computational costs.

Robust purification methods can enhance AI security wherever neural network or machine learning applications are deployed, including potentially in the Laboratory's national security mission.

"Our method has set a new benchmark in state-of-the-art performance across renowned datasets, excelling under both white-box and black-box attack scenarios," said Minh Vu, Los Alamos AI researcher. "This achievement means we can now purify the data -- whether sourced privately or publicly -- before using it to train foundational models, ensuring their safety and integrity while consistently delivering accurate results."
[2]
New AI defense method shields models from adversarial attacks | Newswise
Newswise -- Neural networks, a type of artificial intelligence modeled on the connectivity of the human brain, are driving critical breakthroughs across a wide range of scientific domains. But these models face a significant threat from adversarial attacks, which can derail predictions and produce incorrect information. Los Alamos National Laboratory researchers have pioneered a novel purification strategy that counteracts adversarial assaults and preserves the robust performance of neural networks.

"Adversarial attacks to AI systems can take the form of tiny, near-invisible tweaks to input images, subtle modifications that can steer the model toward the outcome an attacker wants," said Manish Bhattarai, Los Alamos computer scientist. "Such vulnerabilities allow malicious actors to flood digital channels with deceptive or harmful content under the guise of genuine outputs, posing a direct threat to trust and reliability in AI-driven technologies."

The Low-Rank Iterative Diffusion (LoRID) method removes adversarial interventions from input data by harnessing the power of generative denoising diffusion processes in tandem with advanced tensor decomposition techniques. In a series of tests on benchmark datasets, LoRID achieved unparalleled accuracy in neutralizing adversarial noise in attack scenarios, potentially advancing a more secure, reliable AI capability.

Defeating dangerous noise

Diffusion is a technique for training AI models by adding noise to data and then teaching the models to remove it. By learning to clean up the noise, the AI model effectively learns the underlying structure of the data, enabling it to generate realistic samples on its own. In diffusion-based purification, the model leverages its learned representation of "clean" data to identify and eliminate any adversarial interference introduced into the input.

Unfortunately, applying too many noise-purifying steps can strip away essential details from the data -- imagine scrubbing a photo so aggressively that it loses clarity -- while too few steps leave room for harmful perturbations to linger. The LoRID method navigates this trade-off by employing multiple rounds of denoising at the earlier phases of the diffusion process, helping the model eliminate precisely the right amount of noise without compromising the meaningful content of the data, thereby fortifying the model against attacks.

Crucially, adversarial inputs often reveal subtle "low-rank" signatures -- patterns that can slip past complex defenses. By weaving in a technique called tensor factorization, LoRID pinpoints these low-rank aspects, bolstering the model's defense in large adversarial attack regimes.

The team tested LoRID using widely recognized benchmark datasets such as CIFAR-10, CIFAR-100, Celeb-HQ, and ImageNet, evaluating its performance against state-of-the-art black-box and white-box adversarial attacks. In white-box attacks, adversaries have full knowledge of the AI model's architecture and parameters. In black-box attacks, they only see inputs and outputs, with the model's internal workings hidden. Across every test, LoRID consistently outperformed other methods, particularly in terms of robust accuracy -- the key indicator of a model's reliability when under adversarial threat.

Venado helps unlock efficiency, results

The team ran the LoRID models on Venado, the Lab's newest, AI-capable supercomputer, to test a range of state-of-the-art vision models against both black-box and white-box adversarial attacks. By harnessing multiple Venado nodes for several weeks -- an ambitious effort given the massive compute requirements -- they became the first group to undertake such a comprehensive analysis. Venado's power turned months of simulation into mere hours, slashing the total development timeline from years to just one month and significantly reducing computational costs.

Robust purification methods can enhance AI security wherever neural network or machine learning applications are deployed, including potentially in the Laboratory's national security mission.

"Our method has set a new benchmark in state-of-the-art performance across renowned datasets, excelling under both white-box and black-box attack scenarios," said Minh Vu, Los Alamos AI researcher. "This achievement means we can now purify the data -- whether sourced privately or publicly -- before using it to train foundational models, ensuring their safety and integrity while consistently delivering accurate results."

The team presented their work and results at the prestigious AAAI Conference on Artificial Intelligence, known as AAAI-2025, hosted by the Association for the Advancement of Artificial Intelligence.
Scientists at Los Alamos National Laboratory have created a novel AI defense method called Low-Rank Iterative Diffusion (LoRID) that effectively shields neural networks from adversarial attacks, setting a new benchmark in AI security.
Researchers at Los Alamos National Laboratory have developed a groundbreaking AI defense strategy called Low-Rank Iterative Diffusion (LoRID), designed to protect neural networks from adversarial attacks. This innovative method has demonstrated unparalleled accuracy in neutralizing adversarial noise, potentially advancing more secure and reliable AI capabilities [1][2].
Neural networks, while driving critical breakthroughs across various scientific domains, face significant threats from adversarial attacks. These attacks can derail predictions and produce incorrect information, posing a direct threat to the trust and reliability of AI-driven technologies. Manish Bhattarai, a Los Alamos computer scientist, explains that these attacks often take the form of "tiny, near-invisible tweaks to input images" that can steer the model toward an attacker's desired outcome [1].
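To make the scale of such tweaks concrete, the classic fast gradient sign method (FGSM) perturbs each pixel by a small epsilon in the direction that increases the model's loss. The sketch below is a standard white-box example, not the specific attack evaluated in the LoRID study; the placeholder classifier and the epsilon value are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=8 / 255):
    """Fast gradient sign method: nudge every pixel by +/- epsilon in the
    direction that increases the classification loss (white-box, since it
    needs gradients and therefore full access to the model)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    perturbed = images + epsilon * images.grad.sign()
    return perturbed.clamp(0, 1).detach()

if __name__ == "__main__":
    # Placeholder classifier standing in for the vision models mentioned above.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x = torch.rand(4, 3, 32, 32)
    y = torch.randint(0, 10, (4,))
    x_adv = fgsm_attack(model, x, y)
    print((x_adv - x).abs().max())  # perturbation stays within epsilon
```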
The LoRID method employs a combination of generative denoising diffusion processes and advanced tensor decomposition techniques to remove adversarial interventions from input data. This approach navigates the delicate balance between eliminating harmful noise and preserving essential data details [1][2].
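A rough intuition for the "low-rank signature" idea is the matrix analogue of tensor factorization: a truncated SVD splits an input into a low-rank component and a residual. The sketch below uses plain SVD and an arbitrary rank cutoff purely for illustration; LoRID's actual tensor decomposition is more involved.

```python
import numpy as np

def low_rank_split(x, rank=5):
    """Split a 2-D array into a rank-`rank` approximation and a residual via
    truncated SVD -- the matrix analogue of a tensor factorization step."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    residual = x - low_rank
    return low_rank, residual

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(32, 32))
    # A structured rank-1 perturbation standing in for a low-rank adversarial signature.
    perturbation = 0.5 * np.outer(rng.normal(size=32), rng.normal(size=32))
    low_rank, residual = low_rank_split(clean + perturbation, rank=1)
    print(np.linalg.norm(low_rank), np.linalg.norm(residual))
```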
Key features of LoRID include:
- Multiple rounds of denoising in the early phases of the diffusion process, removing adversarial noise without stripping away the meaningful content of the data
- Tensor factorization to pinpoint the subtle low-rank signatures of adversarial inputs, strengthening the defense in large adversarial attack regimes
The team tested LoRID using widely recognized benchmark datasets such as CIFAR-10, CIFAR-100, Celeb-HQ, and ImageNet. The method was evaluated against state-of-the-art black-box and white-box adversarial attacks [1].
LoRID consistently outperformed other methods across all tests, particularly in terms of robust accuracy -- the key indicator of a model's reliability under adversarial threat [2].
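Robust accuracy is simply classification accuracy measured on adversarially perturbed inputs rather than clean ones. The minimal sketch below assumes a generic attack callable (a bounded-noise stand-in is used so the script runs on its own); none of the names come from the LoRID paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def robust_accuracy(model, attack, loader):
    """Fraction of adversarially perturbed inputs the model still classifies
    correctly -- the 'robust accuracy' metric referenced above."""
    correct, total = 0, 0
    model.eval()
    for images, labels in loader:
        adv = attack(model, images, labels)   # perturbed copies of the batch
        with torch.no_grad():
            preds = model(adv).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

if __name__ == "__main__":
    # Placeholder classifier and data; the "attack" here just adds bounded
    # random noise so the example is self-contained.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    data = TensorDataset(torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,)))
    noise_attack = lambda m, x, y: (x + 0.03 * torch.randn_like(x)).clamp(0, 1)
    print(robust_accuracy(model, noise_attack, DataLoader(data, batch_size=8)))
```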
The research team leveraged Venado, Los Alamos' newest AI-capable supercomputer, to conduct their comprehensive analysis. This powerful computing resource significantly reduced the development timeline from years to just one month, demonstrating the importance of advanced computing infrastructure in AI research [1][2].
The success of LoRID has far-reaching implications for AI security. Minh Vu, a Los Alamos AI researcher, notes that this achievement allows for the purification of data before using it to train foundational models, ensuring their safety and integrity while consistently delivering accurate results [2].
The robust purification methods developed through this research can enhance AI security across various applications of neural networks and machine learning, potentially including the Laboratory's national security mission [1].
The team presented their groundbreaking work at the prestigious AAAI Conference on Artificial Intelligence (AAAI-2025), hosted by the Association for the Advancement of Artificial Intelligence. This presentation underscores the significance of their contribution to the field of AI security [2].