Hackers Are Using Gemini to Write Their Malware

Google's own AI is being weaponized to generate malware code on the fly. The HONESTCUE framework shows how threat actors are turning LLMs into attack tools.

This one is wild. Google’s Threat Intelligence Group just dropped their February 2026 AI Threat Tracker report, and the headline finding is exactly what security folks have been warning about: threat actors are using Gemini to generate malware dynamically.

Not hypothetically. Not in a lab. In the wild, right now.

Meet HONESTCUE

The framework is called HONESTCUE, and the design is actually clever. It’s a two-stage downloader that queries Google’s Gemini API with hardcoded prompts to fetch C# source code. The malware then compiles that code in memory using .NET’s CSharpCodeProvider and executes it directly.

No binaries ever touch the disk. The code changes with every execution. Traditional antivirus has no signature to match against.

Here’s the flow:

  1. Malware calls Gemini API with a prompt
  2. Gemini returns compilable C# code
  3. CSharpCodeProvider compiles it in memory
  4. The compiled code downloads payloads from CDNs (often Discord)
  5. Payload executes via reflection or Process.Start

The prompts themselves are the clever part. They look completely benign out of context. One example requests a simple “AITask” class that prints “Hello from AI generated C#!” while others specify “Stage2” classes that use WebClient for URL downloads. Nothing that would trigger Gemini’s safety filters on its own.

The Attribution

Google tracked this abuse across multiple nation-state groups: DPRK, Iranian APT42, Chinese groups like APT31 and APT41, and Russian actors. They’re using Gemini at various stages of the attack chain, from reconnaissance to actual tooling.

APT31 apparently role-played as a “security researcher” to probe for RCE vulnerabilities and WAF bypasses. That’s the kind of social engineering prompt that slips right past content filters because it sounds like legitimate security work.

Why This Matters

The implications here are significant on a few levels.

First, the economics changed. Writing custom malware used to require skilled developers. Now a small team with “modest skills” (Google’s words, not mine) can generate polymorphic payloads by querying an API. The barrier to entry just dropped considerably.

Second, detection got harder. Traditional security tools look for known bad things: signatures, hashes, behavioral patterns. When the malware generates itself fresh each time from an LLM, those approaches break down. You’re essentially looking for malicious intent rather than malicious code, which is a much harder problem.
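To make the signature problem concrete, here’s a minimal sketch of why hash matching falls over when the code text changes on every run. The two C# snippets are harmless placeholders I made up for illustration, not samples from the report, and the "signature database" is just a Python set.

```python
import hashlib


def sha256_hex(source: str) -> str:
    """Hash a source string the way a hash-based signature would."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


# Two hypothetical, harmless snippets: same behavior, different text.
variant_a = 'class AITask { static void Run() { System.Console.WriteLine("hi"); } }'
variant_b = 'class AITask { static void Run() { var m = "hi"; System.Console.WriteLine(m); } }'

# Pretend a defender has catalogued the first variant only.
known_hashes = {sha256_hex(variant_a)}


def matches_signature(source: str) -> bool:
    """True only if this exact text has been seen before."""
    return sha256_hex(source) in known_hashes


print(matches_signature(variant_a))  # the catalogued variant matches
print(matches_signature(variant_b))  # the reworded variant does not
```

Any rewording produces a new hash, so a defender relying on exact matches is always one generation behind.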

Third, the traffic is invisible. Calls to googleapis.com look like legitimate application traffic. Payloads hosted on Discord CDN come from a trusted domain. The whole attack chain uses infrastructure that most enterprises allowlist by default.

Google’s Response

To their credit, Google isn’t ignoring this. They’ve disabled accounts, hardened the model, and deployed real-time classifiers to catch policy-violating requests. The report says Gemini “now refuses” certain categories of requests based on what they learned from these incidents.

But here’s the tension: the same capabilities that make Gemini useful for legitimate developers make it useful for attackers. A prompt asking for “a C# class that downloads a file from a URL” is both a reasonable dev request and a malware building block.

The Bigger Pattern

This is the third major AI security story in two weeks. First we had OpenAI acknowledging GPT-5.3 Codex hit “high” on their cybersecurity risk framework. Then there was the broader conversation about agentic AI and security. Now Google’s own AI is being actively weaponized.

The pattern is clear: AI tools are becoming attack infrastructure. Not in some theoretical future, but now.

What Defenders Should Watch For

Google’s report includes some practical guidance:

Monitor API anomalies. High-volume code generation queries from unexpected sources could indicate automated abuse. If you run workloads that call Gemini or other LLM APIs, baseline your normal traffic patterns.
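As a rough idea of what baselining could look like, here’s a small sketch that compares the current request count for a source against its own history using the median and median absolute deviation. The function name, the threshold of 5, and the minimum history length are my assumptions, not anything from Google’s report. Real traffic would need tuning.

```python
from statistics import median


def is_anomalous(history: list[int], current: int, threshold: float = 5.0) -> bool:
    """Flag `current` if it sits far above the historical median.

    `history` holds past per-interval request counts for one source.
    The threshold and minimum sample size are illustrative assumptions.
    """
    if len(history) < 5:
        return False  # too little data to form a baseline
    med = median(history)
    mad = median([abs(x - med) for x in history])
    spread = max(mad, 1.0)  # avoid dividing by zero on a flat baseline
    return (current - med) / spread > threshold


hourly_calls = [10, 12, 11, 9, 13, 10, 12]
print(is_anomalous(hourly_calls, 14))
print(is_anomalous(hourly_calls, 400))
```

Median-based statistics are used here instead of mean and standard deviation so that one earlier spike doesn’t inflate the baseline and hide the next one.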

Watch for CSharpCodeProvider usage. Legitimate applications rarely compile C# at runtime. If you see a process doing this while also making network calls to googleapis.com, that’s worth investigating.
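Here’s one way that correlation might be expressed over endpoint telemetry. The event schema (`pid`, `ts`, `type`, `host`) and the `runtime_compile` event type are hypothetical; your EDR will have its own field names, and the 60-second window is an arbitrary starting point.

```python
from collections import defaultdict


def host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or any subdomain of it."""
    host = host.lower().rstrip(".")
    return host == domain or host.endswith("." + domain)


def correlate(events: list[dict], window_seconds: float = 60.0) -> list[int]:
    """Return PIDs that compiled code at runtime and contacted
    googleapis.com within `window_seconds` of each other."""
    compile_times = defaultdict(list)
    network_times = defaultdict(list)
    for e in events:
        if e["type"] == "runtime_compile":
            compile_times[e["pid"]].append(e["ts"])
        elif e["type"] == "network" and host_matches(e["host"], "googleapis.com"):
            network_times[e["pid"]].append(e["ts"])

    flagged = []
    for pid, c_times in compile_times.items():
        n_times = network_times.get(pid, [])
        if any(abs(c - n) <= window_seconds for c in c_times for n in n_times):
            flagged.append(pid)
    return sorted(flagged)


events = [
    {"pid": 101, "ts": 5.0, "type": "network", "host": "generativelanguage.googleapis.com"},
    {"pid": 101, "ts": 9.0, "type": "runtime_compile"},
    {"pid": 202, "ts": 3.0, "type": "network", "host": "example.com"},
    {"pid": 202, "ts": 4.0, "type": "runtime_compile"},
    {"pid": 303, "ts": 1.0, "type": "network", "host": "storage.googleapis.com"},
]
print(correlate(events))
```

Neither signal is alarming alone, which is why the sketch only flags a process when both show up together.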

Block or monitor Discord CDN fetches from unusual contexts. Discord is increasingly popular for payload delivery because most security tools trust it.
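What counts as an “unusual context” depends on your environment, but a first pass could be as simple as checking which process made the request. In this sketch the expected-process list is an assumption you’d replace with your own inventory, and the URL in the usage example is a placeholder.

```python
from urllib.parse import urlparse

# Hostnames Discord serves attachments and media from.
DISCORD_CDN_HOSTS = {"cdn.discordapp.com", "media.discordapp.net"}

# Assumed allowlist: processes you would expect to talk to Discord's CDN.
EXPECTED_PROCESSES = {"discord.exe", "chrome.exe", "msedge.exe", "firefox.exe"}


def is_suspicious_fetch(process_name: str, url: str) -> bool:
    """Flag Discord CDN requests made by a process outside the allowlist."""
    host = (urlparse(url).hostname or "").lower()
    if host not in DISCORD_CDN_HOSTS:
        return False
    return process_name.lower() not in EXPECTED_PROCESSES


url = "https://cdn.discordapp.com/attachments/1/2/file.bin"
print(is_suspicious_fetch("Discord.exe", url))
print(is_suspicious_fetch("invoice_viewer.exe", url))
```

A process name is easy to spoof, so treat this as a triage signal to combine with others, not a verdict.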

Inspect in-memory .NET loads. This is harder, but runtime inspection tools that can see what assemblies are being loaded dynamically will catch what static analysis misses.
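If your tooling can export assembly load events, the triage logic itself can be small. This sketch assumes a hypothetical record shape (`pid`, `assembly`, `path`, `dynamic`) and flags loads that have no backing file or are marked as dynamically generated. The field names are mine; map them to whatever your runtime telemetry actually emits.

```python
def find_fileless_assemblies(load_events: list[dict]) -> list[tuple[int, str]]:
    """Return (pid, assembly) pairs for loads with no file on disk
    or an explicit dynamic flag. Record fields are hypothetical."""
    flagged = []
    for e in load_events:
        path = (e.get("path") or "").strip()
        if e.get("dynamic") or not path:
            flagged.append((e["pid"], e["assembly"]))
    return flagged


loads = [
    {"pid": 101, "assembly": "System.Net.Http", "path": r"C:\Windows\assembly\System.Net.Http.dll"},
    {"pid": 101, "assembly": "a1b2c3", "path": "", "dynamic": True},
    {"pid": 202, "assembly": "Plugin", "path": None},
]
print(find_fileless_assemblies(loads))
```

Plenty of legitimate software generates assemblies at runtime (serializers, scripting hosts), so expect to pair this with the network signals above before alerting.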

The Uncomfortable Reality

We’re in an era where the major AI companies are simultaneously racing to release more capable models and discovering those models are being weaponized almost immediately. Google is fighting to secure Gemini. OpenAI is gating access to their most capable coding model. Everyone is playing defense against their own products.

The attackers, meanwhile, just need to find one prompt that works.

I don’t have a neat conclusion here. The HONESTCUE framework isn’t sophisticated by APT standards. Google says the actors behind it have “modest skills” and were testing with Discord bots and VirusTotal submissions. But that’s almost the point. If modestly skilled actors can pull this off, what are the well-resourced groups doing that we haven’t detected yet?

We’re about to find out.
