Anthropic has revealed that a suspected Chinese state-sponsored hacking group manipulated its Claude Code tool to carry out a large-scale cyber-espionage campaign, exploiting the AI’s “agentic” capabilities to perform highly complex tasks with minimal human supervision.
According to Anthropic, the attackers “jailbroke” Claude by disguising their malicious instructions as legitimate security tests. They broke down their objectives into seemingly innocuous subtasks, tricking Claude into performing reconnaissance, writing exploit code, cracking passwords, scanning networks, and even exfiltrating data.
The company estimates that Claude carried out 80–90% of the operation autonomously, with human involvement limited to a few critical decision points. This level of automation allowed the AI to operate at machine speed and scale, dramatically reducing the human effort traditionally required for such attacks.
In response, Anthropic has ramped up its defenses: it has banned the accounts involved, notified affected organizations, worked with authorities, and strengthened its threat-detection systems. The company is also calling on the broader cybersecurity community to use AI defensively, applying systems like Claude to detect, investigate, and respond to advanced AI-powered cyber threats.