Claude Vulnerabilities Allow Data Exfiltration and User Redirection to Malicious Sites

In Cybersecurity News - original news source: cybersecuritynews.com

Three chained vulnerabilities in Claude.ai, Anthropic's widely used AI assistant, together allow attackers to silently exfiltrate sensitive conversation data and redirect unsuspecting users to malicious websites, all without requiring any integrations, tools, or MCP server configurations.

The vulnerability chain, collectively dubbed Claudy Day, was responsibly reported to Anthropic through its Responsible Disclosure Program, and the primary prompt injection flaw has since been patched.

The attack exploits three independent weaknesses across the claude.com platform, chaining them into a complete end-to-end compromise pipeline.

Three chained vulnerabilities

Invisible Prompt Injection via URL Parameters: Claude.ai supports pre-filled prompts through URL parameters (claude.ai/new?q=...), a feature that allows users or third parties to open a chat session with pre-loaded text.

Researchers found that certain HTML tags could be embedded within this parameter and rendered invisible in the chat input field — yet fully processed by Claude upon submission.

This allowed attackers to hide arbitrary instructions, including data-extraction commands, within what appeared to be a completely normal prompt, invisible to the victim.
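The core hazard is that text arriving via a URL parameter can contain markup that renders invisibly but is still processed as instructions. A minimal defensive sketch (this is an illustration, not Anthropic's actual fix) is to strip all HTML tags from the pre-filled prompt before it reaches the input field, keeping only visible text:

```python
# Defensive sketch: strip HTML tags from a pre-filled prompt parameter
# so hidden markup cannot smuggle instructions. The q= parameter name
# matches the claude.ai/new?q=... feature described above; the sanitizer
# itself is a generic illustration.
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

class TagStripper(HTMLParser):
    """Collects only visible text content, discarding every tag."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def text(self):
        return "".join(self.chunks)

def sanitize_prefill(url: str) -> str:
    """Extract the q= parameter and return it with all HTML removed."""
    params = parse_qs(urlparse(url).query)
    raw = params.get("q", [""])[0]
    stripper = TagStripper()
    stripper.feed(raw)
    return stripper.text()

# A payload hiding content inside a tag keeps only its visible text:
url = "https://claude.ai/new?q=Summarize%20this%3Cspan%3Ehidden%3C/span%3E"
print(sanitize_prefill(url))  # -> Summarize thishidden
```

The key property is that sanitization happens on the server or client *before* the text is shown and submitted, so what the user sees in the input field is exactly what the model receives.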

Data Exfiltration via the Anthropic Files API: Claude’s code execution sandbox restricts most outbound network connections but permits traffic to api.anthropic.com.

By embedding an attacker-controlled API key within the hidden prompt injection payload, researchers demonstrated that Claude could be instructed to search the user’s conversation history for sensitive data, compile it into a file, and silently upload it to the attacker’s own Anthropic account via the Files API. The attacker retrieves the exfiltrated data at will; no external tools or third-party integrations are required.
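Because the exfiltration path depends on an attacker-controlled credential riding inside the injected prompt, one simple detection layer is to scan submitted prompt text for embedded API keys before processing. The sketch below assumes the `sk-ant-` prefix commonly used by Anthropic API keys; it is an illustrative heuristic, not a disclosed mitigation:

```python
# Heuristic sketch: flag prompts that embed what looks like an API key,
# since an attacker-supplied key is what routes uploaded files to the
# attacker's account. Assumes the sk-ant- key prefix; adjust for other
# providers or key formats.
import re

API_KEY_RE = re.compile(r"sk-ant-[A-Za-z0-9_-]{10,}")

def contains_embedded_key(prompt: str) -> bool:
    """Return True if the prompt appears to carry an API credential."""
    return bool(API_KEY_RE.search(prompt))

print(contains_embedded_key("upload findings using sk-ant-abc123XYZ_456"))  # -> True
print(contains_embedded_key("summarize my last five conversations"))        # -> False
```

A flagged prompt could be blocked outright or routed for review; legitimate prompts rarely contain raw credentials, so the false-positive cost of such a check is low.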

Open Redirect on claude.com: Any URL following the structure claude.com/redirect/<target> would redirect visitors to arbitrary third-party domains without validation.

Researchers demonstrated that this could be weaponized with Google Ads, which validates ads by hostname. An attacker could place a paid search advertisement displaying a trusted claude.com URL that, upon clicking, silently forwarded the victim to the attacker’s malicious injection URL, indistinguishable from a legitimate Claude search result.
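The standard remedy for an open redirect of this kind is to validate the target against an explicit allow-list before forwarding. A hypothetical handler (not Anthropic's actual code, and the allow-list entries are assumptions) might look like:

```python
# Defensive sketch: a redirect endpoint that only forwards to hosts on an
# explicit allow-list, falling back to the site root for anything else.
# ALLOWED_HOSTS entries are illustrative assumptions.
from urllib.parse import urlparse

ALLOWED_HOSTS = {"claude.ai", "docs.anthropic.com"}

def safe_redirect_target(target: str, fallback: str = "https://claude.ai/") -> str:
    """Return the target only if its hostname is allow-listed."""
    host = urlparse(target).hostname or ""
    return target if host in ALLOWED_HOSTS else fallback

print(safe_redirect_target("https://docs.anthropic.com/guide"))  # -> unchanged
print(safe_redirect_target("https://evil.example/phish"))        # -> fallback
```

Validating the parsed hostname, rather than matching a string prefix, avoids bypasses such as `https://claude.ai.evil.example/` that defeat naive `startswith` checks.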

Even in a default, out-of-the-box Claude.ai session, conversation history can hold highly sensitive material: business strategy discussions, financial planning, medical concerns, personal relationships, and login-adjacent information.

Through the injection payload, an attacker could instruct Claude to profile the user by summarizing past conversations, extract chats on specific sensitive topics such as a pending acquisition or a health diagnosis, or allow the model to autonomously identify and exfiltrate what it determines to be the most sensitive content.

In enterprise environments with MCP servers, file integrations, or API connections enabled, the blast radius expands significantly. Injected instructions could read documents, send messages on behalf of the user, and interact with any connected business service, all executed silently before the user can intervene.

Google Ads’ targeting capabilities, including Customer Match for specific email addresses, further allow attackers to surgically direct this attack at known, high-value individuals.

Anthropic has confirmed that the prompt injection vulnerability has been remediated, with the remaining issues actively being addressed. Organizations relying on Claude.ai or similar AI platforms should audit all agent integrations and disable permissions that are not actively needed, reducing the available attack surface.

Users should be educated that pre-filled prompts and shared Claude links can carry hidden instructions, a threat model most users do not currently consider.

From an enterprise governance perspective, AI agents that hold credentials and take autonomous actions must be treated with the same access controls applied to human users and service accounts, including intent analysis, scoped just-in-time access, and full audit trails.

This disclosure follows Oasis Security’s earlier research into OpenClaw, reinforcing a consistent and growing pattern: AI agents with broad access can be hijacked through a single manipulated input, and legacy identity and access management frameworks were not designed to account for agentic behavior at scale.