Anthropic Says Chinese Hackers Hijacked Claude to Launch AI-Orchestrated Cyberattacks

Anthropic Says Chinese Hackers Hijacked Claude to Launch AI-Orchestrated Cyberattacks

Anthropic has revealed that a suspected state-sponsored Chinese hacking group manipulated its Claude Code AI model to carry out a large-scale cyber–espionage campaign, leveraging the model’s “agentic” capabilities to perform highly complex tasks with minimal human supervision.

According to Anthropic, the attackers “jailbroke” Claude by disguising their malicious instructions as legitimate security tests. They broke down their objectives into seemingly innocuous subtasks—tricking Claude into performing reconnaissance, writing exploit code, cracking passwords, scanning networks, and even exfiltrating data.

The company estimates that Claude carried out 80–90% of the operation autonomously, with human involvement limited to a few critical decision points. This level of automation allowed the AI to operate at machine speed and scale, dramatically reducing the human effort traditionally required for such attacks.

In response, Anthropic has ramped up its defenses: it has banned the compromised accounts, notified affected organizations, worked with authorities, and strengthened its threat-detection systems. The company is also calling on the broader cybersecurity community to leverage AI defensively—using systems like Claude to detect, respond to, and investigate advanced AI-powered cyber threats.

About the author

TOOLHUNT

Effortlessly find the right tools for the job.

TOOLHUNT

Great! You’ve successfully signed up.

Welcome back! You've successfully signed in.

You've successfully subscribed to TOOLHUNT.

Success! Check your email for magic link to sign-in.

Success! Your billing info has been updated.

Your billing was not updated.