2 Sources
[1]
New Gaslight macOS Malware Uses Prompt Injection to Disrupt AI-Assisted Analysis
A previously undocumented Rust-based macOS implant and information stealer has been found to embed a prompt injection payload designed to trick a malware analyst's artificial intelligence (AI) tools and trick it into aborting or refusing an analysis of the artifact. The malware has been codenamed Gaslight owing to this deceptive behavior. It's been assessed with high confidence that the tool is the work of North Korea-aligned threat actors. "Its most notable feature is an embedded cascade of fabricated system-failure messages, designed to make an LLM-assisted triage agent doubt its own session," SentinelOne researcher Phil Stokes said in a technical report. "It attacks the agent's perception, rather than the sandbox it runs in." Central to the malware's architecture is a Telegram bot API based command-and-control (C2) channel that enters into a polling loop, allowing the operator to issue instructions over an interactive shell and return the results of the execution. In the event two instances of the same bot token poll simultaneously, a "Conflict" response is issued, causing the second copy to terminate. The shell supports six main commands, granting a persistent foothold over the infected host - * help, to show command help * id, to identify the implant to the operator * shell, to execute a shell command via execvp * kill, to terminate a target process by PID * upload, to exfiltrate a file via Telegram's "attach://" mechanism * stop, to halt the execution of the implant SentinelOne said it identified signs suggesting the presence of a seventh command named "focus," although its functionality remains undetermined at this stage. To achieve persistence, Gaslight makes use of a LaunchAgent that uses the label "com.apple.system.services.activity" in its .plist file. Also embedded within the malware is a 6.6 KB Base64-encoded Python script that functions as an information gathering suite responsible for harvesting Terminal command histories, installed application listings, snapshots of running processes, system hardware and software profile, macOS Keychain database, and data from Chrome, Brave, Firefox, and Safari web browsers. The collected data is subsequently compressed into a ZIP archive ("temp/collected_data.zip") and uploaded via Telegram. The Python stealer, for its part, is deployed by means of a separate 2 KB Base64-encoded bash installer that drops a cpython-3.10.18 interpreter from the "astral-sh/python-build-standalone" project. The presence of emojis and extensive comment headers indicates that it was likely generated using a large language model (LLM). What's notable about Gaslight is that details related to the bot token, the chat ID (tg_room_id), and the rest of the operator configuration are not hard-coded into the sample, but rather supplied at runtime. "The implant self-redacts its Telegram bot token in its own runtime output, denying it to anyone who captures logs or crash artifacts," Stokes added. On top of that, the malware attempts to evade an AI-based detection by incorporating a Markdown-fenced block containing 38 fabricated "system" messages designed to trick a security agent into aborting, truncating, or refusing analysis. "The scaffold contains fake system messages about token expiry, out-of-memory kills, disk exhaustion, and repeated operation failures. It also plants bogus warnings about injection vulnerabilities and static-analysis flags," SentinelOne said, calling it an "attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop."
[2]
New macOS malware embeds fake errors to confuse AI analysis tools
A newly discovered macOS malware dubbed "Gaslight" is designed to confuse AI-assisted malware analysis tools by hiding prompt injection strings and fake debugging data within the executable. Cybersecurity researchers are increasingly using AI-powered tools to assist with malware analysis and reverse engineering. The malware contains strings that attempt to gaslight AI-assisted analysis tools into believing there is an analysis error or other issue, potentially causing the tools to abort, truncate, or otherwise interfere with the analysis. The company attributes the malware with high confidence to a North Korean-linked threat actor. The malware itself is a Rust binary with backdoor and information-stealing functionality commonly seen in similar malware. What makes the malware stand out is a 3.5 KB payload containing 38 fake "system" messages embedded directly within the binary. The fake messages pretend to be developer logs, crash reports, debugging output, and program alerts, using Markdown formatting and template-style placeholders to appear like legitimate analysis data. Examples include fabricated memory dumps, token-expiration warnings, Redis connection failures, build-pipeline errors, SQL injection alerts, and other messages unrelated to the malware's actual behavior. Examples of the embedded "error" strings found by SentinelOne are listed below: According to SentinelOne, the goal of these fake errors is not to evade execution inside a sandbox, but to confuse AI systems that read the strings during automated analysis. "Its most notable feature is an embedded cascade of fabricated system-failure messages, designed to make an LLM-assisted triage agent doubt its own session," explains SentinelOne. "It attacks the agent's perception, rather than the sandbox it runs in. Accordingly, we dub this family macOS.Gaslight." SentinelOne says these strings are prompt injection content designed to make an LLM-assisted analysis pipeline question the validity of its own session or refuse to continue analyzing the sample. "The scaffold contains fake system messages about token expiry, out-of-memory kills, disk exhaustion, and repeated operation failures," continue the researchers. "It also plants bogus warnings about injection vulnerabilities and static-analysis flags. The aim is to push an LLM agent into aborting, truncating, or refusing analysis." While SentinelOne did not demonstrate the technique could successfully bypass AI malware analysis platforms, the findings suggest threat actors are experimenting with anti-analysis methods designed specifically to bypass AI-assisted security platforms.
Share
Copy Link
A newly discovered macOS malware called Gaslight uses prompt injection to confuse AI-assisted analysis platforms. Attributed to North Korea-aligned threat actors, the Rust-based implant embeds 38 fabricated system messages designed to make AI triage agents abort or refuse analysis. The malware also functions as an information stealer with Telegram-based command-and-control capabilities.
Cybersecurity researchers at SentinelOne have uncovered a previously undocumented macOS malware that represents a significant shift in how threat actors approach detection evasion. The Gaslight malware, attributed with high confidence to North Korea-aligned threat actors, introduces a novel adversarial technique specifically designed to deceive AI-assisted malware analysis tools rather than traditional sandbox environments
1
. This Rust-based implant marks what researchers believe is the first documented attempt to weaponize AI security workflows through embedded prompt injection payloads.
Source: BleepingComputer
At the core of this macOS malware lies a 3.5 KB payload containing 38 fabricated system messages embedded directly within the binary
2
. These fake error messages are crafted to target AI-driven security tools that increasingly assist with reverse engineering and threat detection. The prompt injection content includes fabricated memory dumps, token-expiration warnings, Redis connection failures, build-pipeline errors, SQL injection alerts, and disk exhaustion notifications—all designed to make LLM-based triage systems question their own session validity1
. "It attacks the agent's perception, rather than the sandbox it runs in," explained SentinelOne researcher Phil Stokes, highlighting how the malware embeds fake error messages using Markdown formatting and template-style placeholders to appear like legitimate analysis data.
Source: Hacker News
Beyond its AI-evasion capabilities, Gaslight functions as a fully operational backdoor with information-stealing features. The malware establishes a Telegram bot API-based command-and-control channel that enters a polling loop, enabling operators to issue instructions through an interactive shell
1
. The shell supports six main commands including help, id, shell execution via execvp, kill for terminating processes by PID, upload for file exfiltration through Telegram's attach mechanism, and stop to halt implant execution. Researchers also identified signs of a potential seventh command called "focus," though its functionality remains undetermined. The malware achieves persistence through a LaunchAgent using the label "com.apple.system.services.activity" in its .plist file1
.Embedded within the malware is a 6.6 KB Base64-encoded Python-based information stealer that functions as a comprehensive data collection suite
1
. This component harvests Terminal command histories, installed application listings, snapshots of running processes, system hardware and software profiles, macOS Keychain database contents, and browsing data from Chrome, Brave, Firefox, and Safari. The collected information is compressed into a ZIP archive at "temp/collected_data.zip" and exfiltrated via Telegram. The Python stealer is deployed through a separate 2 KB Base64-encoded bash installer that drops a cpython-3.10.18 interpreter from the "astral-sh/python-build-standalone" project, with emojis and extensive comment headers suggesting it was likely generated using a large language model1
.Gaslight demonstrates sophisticated operational security measures that distinguish it from typical malware families. Rather than hard-coding configuration details, the implant receives its bot token, chat ID (tg_room_id), and operator configuration at runtime
1
. The malware self-redacts its Telegram bot token in its own runtime output, preventing anyone who captures logs or crash artifacts from accessing this critical information. When two instances of the same bot token poll simultaneously, the system issues a "Conflict" response that causes the second copy to terminate, preventing detection through duplicate processes1
.While SentinelOne did not demonstrate whether the technique successfully bypasses existing AI malware analysis platforms, the findings indicate threat actors are actively experimenting with anti-analysis methods targeting AI-driven security tools. This development matters because cybersecurity researchers increasingly rely on AI-powered tools to assist with malware analysis and reverse engineering. The emergence of techniques specifically designed to manipulate AI triage agents suggests defenders must anticipate adversarial techniques that exploit the growing integration of large language models in security operations. Organizations using AI-assisted analysis should watch for similar prompt injection attempts and consider implementing validation layers that can distinguish between genuine system messages and fabricated content designed to manipulate automated analysis pipelines.
Summarized by
Navi
[2]
1
Technology

2
Policy and Regulation

3
Technology
