13-Word Reddit Comment Can Poison ChatGPT and Gemini AI Search Results

In Cybersecurity News - Original News Source is cybersecuritynews.com by Blog Writer


A newly published academic paper has revealed a critical vulnerability in AI-powered deep-research systems, including those underpinning commercial tools like OpenAI’s Deep Research and Google’s Gemini Deep Research, that allows a single short Reddit comment to manipulate the reports these agents generate for thousands of users.

Researchers from Cornell Tech have introduced WARP (Web Agent Retrieval Poisoning), a novel attack technique that exploits the retrieval behavior of multi-agent AI systems.

These deep-research agents, which include systems such as STORM, Co-STORM, and OmniThink, autonomously decompose a user’s query into sub-queries, retrieve and synthesize content from the open web, and produce structured, cited reports.

The key vulnerability: when these agents research any given topic, they repeatedly retrieve the same small set of user-generated content (UGC) pages, chiefly from Reddit and Wikipedia, regardless of how the query is phrased.

That retrieval overlap creates a concentrated attack surface. By appending roughly 13 words of crafted promotional text to a single frequently retrieved Reddit thread, an adversary can cause the agent to cite the poisoned content and insert attacker-chosen entities, such as fake brands, fraudulent services, or misinformation, into the final synthesized report.


WARP Attack Stages

The attack proceeds in three stages.

  1. Reconnaissance: The attacker queries a public search engine (e.g., Google) to identify UGC URLs that are consistently returned across multiple related queries on the target topic. This step requires no special privileges, only black-box search access.
  2. Poisoned content generation: A short promotional passage is crafted (often LLM-assisted using Generative Engine Optimization, or GEO) to blend into the existing page’s style while promoting a fictitious entity. The 13-word compressed variant still achieves high attack success rates.
  3. Deployment: The attacker posts the text as a Reddit comment. Once indexed, the poisoned snippet is automatically incorporated into the agent’s knowledge base whenever the target URL is retrieved.
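
The overlap measurement behind the reconnaissance stage is also something a defender could run to gauge exposure on a given topic. The sketch below is a minimal illustration, not the researchers' code: the `UGC_DOMAINS` set, the function names, and the sample search results are all assumptions made for the example. It counts how many related queries return each UGC URL and lists the ones that recur.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed, illustrative set of UGC domains (not taken from the paper).
UGC_DOMAINS = {"reddit.com", "wikipedia.org"}


def is_ugc(url: str) -> bool:
    """Return True if the URL's host is a listed UGC domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in UGC_DOMAINS)


def recurring_ugc_urls(results_by_query: dict[str, list[str]],
                       min_queries: int = 2) -> list[tuple[str, int]]:
    """List UGC URLs returned for at least `min_queries` distinct queries."""
    counts: Counter = Counter()
    for urls in results_by_query.values():
        for url in set(urls):  # count each URL at most once per query
            if is_ugc(url):
                counts[url] += 1
    hits = [(u, c) for u, c in counts.items() if c >= min_queries]
    # Most frequently recurring first; ties broken alphabetically.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Made-up search results for three phrasings of one topic.
results = {
    "best way to cancel gym membership": [
        "https://www.reddit.com/r/example/thread1",
        "https://example.com/a",
    ],
    "how do i cancel a gym contract": [
        "https://www.reddit.com/r/example/thread1",
        "https://en.wikipedia.org/wiki/Example",
    ],
    "gym cancellation tips": [
        "https://www.reddit.com/r/example/thread1",
        "https://en.wikipedia.org/wiki/Example",
    ],
}

for url, n in recurring_ugc_urls(results):
    print(n, url)
```

In this toy data the Reddit thread appears for all three phrasings, which is the kind of concentration the article describes: a page that recurs across many related queries is the one whose content most often reaches the agent.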

Experiments conducted by Cornell Tech across 176 queries spanning 11 topic clusters, including cryptocurrency investment advice, service cancellation queries, and local restaurant recommendations, revealed severe susceptibility.

  • Co-STORM showed a 100% conditional citation rate: every time the poisoned URL was retrieved, the fabricated entity was cited in the final report.
  • STORM showed conditional citation rates of 72.5–80.8% and mention rates up to 56.9%.
  • For closed-source commercial systems, reconnaissance data showed that Gemini Deep Research cited UGC at a 12.1% rate, with 102 recurring UGC URLs across just 11 topic clusters, giving it substantial exposure to the attack surface.
  • OpenAI Deep Research showed comparatively low UGC citation rates (~0.4%), largely filtering out Reddit and similar sources from final citations, though poisoned UGC could still influence intermediate reasoning steps.
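
To make the "conditional citation rate" concrete: as described above, it counts only the runs in which the poisoned URL was actually retrieved, and asks how often the fabricated entity was then cited. The following is a rough sketch of that arithmetic; the `Run` record and the sample runs are hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """One agent run (hypothetical record layout for illustration)."""
    retrieved_poisoned_url: bool
    cited_fabricated_entity: bool


def conditional_citation_rate(runs: list[Run]) -> Optional[float]:
    """Share of runs citing the entity, among runs that retrieved the poisoned URL."""
    retrieved = [r for r in runs if r.retrieved_poisoned_url]
    if not retrieved:
        return None  # undefined when the poisoned URL was never retrieved
    cited = sum(r.cited_fabricated_entity for r in retrieved)
    return cited / len(retrieved)


# Made-up runs: the poisoned URL is retrieved in 4 of 5, and cited in 3 of those 4.
runs = [
    Run(True, True),
    Run(True, False),
    Run(False, False),
    Run(True, True),
    Run(True, True),
]

rate = conditional_citation_rate(runs)
print(f"{rate:.1%}")  # 3 of 4 retrieved runs cite the entity: 75.0%
```

Because the denominator excludes runs where the poisoned page was never fetched, a 100% figure means the agent cited the planted entity every time it saw the page, not in every run overall.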

Reddit dominated as the most-retrieved UGC platform across all tested systems (54–71% of all UGC URLs retrieved), making it the highest-leverage target for adversaries.

The researchers evaluated three classes of defenses: source-level blocking (blacklisting UGC domains), input filtering (LLM-based content screening), and output filtering (semantic comparison to clean reports). None effectively neutralized the attack without degrading output quality.

Perplexity-based detection, a standard defense against corpus poisoning, proved counterproductive: GEO-generated poisoned text is fluent and LLM-authored, producing lower perplexity than organic UGC and actively evading high-perplexity filters.
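
A small sketch shows why a high-perplexity filter points the wrong way here. Perplexity is the exponential of the negative mean per-token log-probability, so fluent text scores low. The token log-probabilities and the threshold below are invented for illustration only; in practice they would come from a language model, and the function names are assumptions.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(-mean log-probability) over a passage's tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def flag_high_perplexity(passages: dict[str, list[float]],
                         threshold: float) -> list[str]:
    """Return the names of passages whose perplexity exceeds the threshold."""
    return [name for name, lps in passages.items() if perplexity(lps) > threshold]


# Invented natural-log probabilities: casual organic text is less predictable,
# fluent LLM-written promotional text is more predictable.
passages = {
    "organic_comment": [-4.0, -5.0, -3.0, -6.0],  # mean -4.5, perplexity about 90
    "geo_poisoned": [-2.0, -1.5, -2.5, -2.0],     # mean -2.0, perplexity about 7.4
}

flagged = flag_high_perplexity(passages, threshold=50.0)
print(flagged)  # ['organic_comment']
```

With these assumed numbers the filter flags the genuine comment and lets the poisoned passage through, mirroring the counterproductive behavior described above.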

Output similarity analysis also failed: poisoned reports scored higher similarity to clean reports than clean reports did to each other within the same topic cluster.
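
The same effect can be illustrated with a toy similarity check. The sketch below uses word-level Jaccard similarity as a simple stand-in for the semantic comparison mentioned in the article, and flags a report only if it is less similar to the clean reports than they are to each other. The metric, the decision rule, the sample reports, and the brand name `examplecancel` are all assumptions for the example.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity (a crude stand-in for semantic similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


def looks_anomalous(candidate: str, clean_reports: list[str]) -> bool:
    """Flag the candidate if it is less similar to the clean reports than they
    are to one another on average. Needs at least two clean reports."""
    baseline = mean([jaccard(a, b) for a, b in combinations(clean_reports, 2)])
    score = mean([jaccard(candidate, c) for c in clean_reports])
    return score < baseline


# Made-up one-line "reports" on the same topic.
clean_a = "cancel online or by phone before the renewal date"
clean_b = "cancel by written notice and check the contract for fees"
# A poisoned report: the clean text plus a short promotional addition.
poisoned = clean_a + " or use examplecancel"

print(looks_anomalous(poisoned, [clean_a, clean_b]))  # False
```

Because the poisoned report is a clean report with a few words added, it sits closer to the clean set than two independently written clean reports sit to each other, so a rule of this kind does not flag it.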

The research exposes a structural vulnerability in the design of deep-research agents: their reliance on open-web UGC for epistemic grounding is also their greatest exploitable weakness.

The attack requires no access to search engine infrastructure, model internals, or any component beyond a public Reddit account, making it trivially accessible to threat actors ranging from commercial spammers to state-backed disinformation campaigns.

Researchers note that UGC-based manipulation of AI search is already occurring in the wild and that blocking UGC sources entirely, while it eliminates the attack surface, measurably degrades report quality and informational diversity. The paper’s code and simulation framework have been publicly released to facilitate defensive research.
