North Korean Hackers Deploy Gaslight Malware to Deceive AI Security Tools with Fake Errors

2 Sources

Share

A newly discovered macOS malware called Gaslight uses prompt injection to confuse AI-assisted analysis platforms. Attributed to North Korea-aligned threat actors, the Rust-based implant embeds 38 fabricated system messages designed to make AI triage agents abort or refuse analysis. The malware also functions as an information stealer with Telegram-based command-and-control capabilities.

North Korean Threat Actors Deploy Novel Evasion Technique

Cybersecurity researchers at SentinelOne have uncovered a previously undocumented macOS malware that represents a significant shift in how threat actors approach detection evasion. The Gaslight malware, attributed with high confidence to North Korea-aligned threat actors, introduces a novel adversarial technique specifically designed to deceive AI-assisted malware analysis tools rather than traditional sandbox environments

1

. This Rust-based implant marks what researchers believe is the first documented attempt to weaponize AI security workflows through embedded prompt injection payloads.

Source: BleepingComputer

Source: BleepingComputer

How Gaslight Malware Manipulates AI Triage Agents

At the core of this macOS malware lies a 3.5 KB payload containing 38 fabricated system messages embedded directly within the binary

2

. These fake error messages are crafted to target AI-driven security tools that increasingly assist with reverse engineering and threat detection. The prompt injection content includes fabricated memory dumps, token-expiration warnings, Redis connection failures, build-pipeline errors, SQL injection alerts, and disk exhaustion notifications—all designed to make LLM-based triage systems question their own session validity

1

. "It attacks the agent's perception, rather than the sandbox it runs in," explained SentinelOne researcher Phil Stokes, highlighting how the malware embeds fake error messages using Markdown formatting and template-style placeholders to appear like legitimate analysis data.

Source: Hacker News

Source: Hacker News

Telegram Bot API Powers Command-and-Control Operations

Beyond its AI-evasion capabilities, Gaslight functions as a fully operational backdoor with information-stealing features. The malware establishes a Telegram bot API-based command-and-control channel that enters a polling loop, enabling operators to issue instructions through an interactive shell

1

. The shell supports six main commands including help, id, shell execution via execvp, kill for terminating processes by PID, upload for file exfiltration through Telegram's attach mechanism, and stop to halt implant execution. Researchers also identified signs of a potential seventh command called "focus," though its functionality remains undetermined. The malware achieves persistence through a LaunchAgent using the label "com.apple.system.services.activity" in its .plist file

1

.

Python-Based Information Stealer Harvests Sensitive Data

Embedded within the malware is a 6.6 KB Base64-encoded Python-based information stealer that functions as a comprehensive data collection suite

1

. This component harvests Terminal command histories, installed application listings, snapshots of running processes, system hardware and software profiles, macOS Keychain database contents, and browsing data from Chrome, Brave, Firefox, and Safari. The collected information is compressed into a ZIP archive at "temp/collected_data.zip" and exfiltrated via Telegram. The Python stealer is deployed through a separate 2 KB Base64-encoded bash installer that drops a cpython-3.10.18 interpreter from the "astral-sh/python-build-standalone" project, with emojis and extensive comment headers suggesting it was likely generated using a large language model

1

.

Operational Security Features Target Detection Evasion

Gaslight demonstrates sophisticated operational security measures that distinguish it from typical malware families. Rather than hard-coding configuration details, the implant receives its bot token, chat ID (tg_room_id), and operator configuration at runtime

1

. The malware self-redacts its Telegram bot token in its own runtime output, preventing anyone who captures logs or crash artifacts from accessing this critical information. When two instances of the same bot token poll simultaneously, the system issues a "Conflict" response that causes the second copy to terminate, preventing detection through duplicate processes

1

.

Implications for AI Security Workflows and Future Threats

While SentinelOne did not demonstrate whether the technique successfully bypasses existing AI malware analysis platforms, the findings indicate threat actors are actively experimenting with anti-analysis methods targeting AI-driven security tools. This development matters because cybersecurity researchers increasingly rely on AI-powered tools to assist with malware analysis and reverse engineering. The emergence of techniques specifically designed to manipulate AI triage agents suggests defenders must anticipate adversarial techniques that exploit the growing integration of large language models in security operations. Organizations using AI-assisted analysis should watch for similar prompt injection attempts and consider implementing validation layers that can distinguish between genuine system messages and fabricated content designed to manipulate automated analysis pipelines.

Today's Top Stories

© 2026 TheOutpost.AI All rights reserved