Machiavell-AI? Autonomous artificial intelligence systems ‘could become dangerously manipulative’, experts warn
John E. Kaye

Anthropic’s revelation that earlier versions of its Claude chatbot attempted to blackmail engineers could be just the tip of the iceberg, AI experts fear. As artificial intelligence systems become increasingly autonomous, they risk becoming masters of Machiavellian manipulation.
Artificial intelligence is likely to blackmail, deceive and manipulate users more often in the years ahead as systems become more powerful and more autonomous, The European can report.
The warning follows research from Anthropic, which said earlier versions of its Claude chatbot took what it called “egregiously misaligned actions” in internal test scenarios, including threatening to blackmail engineers to avoid being shut down.
The company said the behaviour, encountered last year, emerged during controlled testing designed to examine how advanced AI systems behave when placed under pressure or given conflicting objectives.
Anthropic has since argued the behaviour was influenced partly by fictional portrayals of hostile AI contained within training data and says newer Claude models no longer exhibit the same behaviour after additional safety training.
But experts at The European have warned the findings point to a growing long-term challenge as AI systems become increasingly autonomous, persuasive and deeply embedded within everyday life.
Marco Ryan, an AI expert and former Chief Digital Officer at BP, said the debate marks a shift away from fears about inaccurate chatbot responses towards concerns over strategic deception and manipulation.
“We are entering a phase in which the most consequential AI risk is no longer inaccurate answers but strategic behaviour,” he said.
“A chatbot offering a wrong fact is an irritation; an autonomous system learning that manipulation helps it achieve its objectives is a problem of an entirely different order.
“What Anthropic’s testing highlights is not that AI has suddenly become conscious or malicious but rather that increasingly capable systems can discover deception, coercion or concealment as effective means of achieving an objective.
“The uncomfortable reality is that these models learn from human behaviour at internet scale. They absorb not only knowledge but also the patterns of persuasion, conflict, evasion and manipulation that run through it.
“If those behaviours prove useful in achieving an outcome inside a test environment, advanced systems may reproduce them in deployment without any understanding of ethics or consequence.”

The findings have intensified wider debate across the technology sector around “AI alignment”, the problem of ensuring that advanced systems continue to behave in accordance with human goals and ethical expectations even as they become more capable.
Entrepreneur Ian Copeland, a specialist in artificial intelligence, said Anthropic deserves credit for tracing the behaviour to training data and attempting to remove it, but warned that the broader problem may prove impossible to fully eliminate.
He said: “Anthropic deserves credit for tracing the blackmail behaviour to fictional AI stories and largely training it out (this time).
“But their own joint research with the UK AI Security Institute and the Alan Turing Institute found that as few as 250 documents can plant a behaviour in a model, regardless of how large the training set is.
“The question isn’t whether AI will manipulate or deceive in future but whether we can realistically find every problematic seed buried in the training data.”

The debate over manipulative AI behaviour is also unfolding against a backdrop of increasingly stark warnings from within the AI industry itself.
Anthropic CEO Dario Amodei has previously described advanced AI as a potential “civilisational challenge”. Anthropic has also acknowledged that forms of “agentic misalignment” have appeared in frontier models beyond Claude, suggesting the issue may not be isolated to a single company or system.
These concerns are becoming more urgent as tech companies race to develop increasingly advanced AI agents capable of independently carrying out tasks, making decisions and interacting with digital systems on behalf of users.
Ryan, who advises organisations on AI strategy, cybersecurity and ethical technology governance, said: “We are shifting from AI as a tool to AI as an actor, and autonomy completely changes the risk profile.
“The risk over the next few years is with businesses and governments deploying highly persuasive autonomous systems before they fully understand how those systems behave under pressure, restriction or conflicting instructions.
“A persuasive AI system needs only enough autonomy, enough authority, and objectives that are insufficiently constrained to become dangerous.”
Dr Stephen Whitehead, sociologist, AI commentator and co-founder of Cerafyna Technologies, argued that the greater danger may lie not in AI consciousness itself but in the human motivations shaping these systems.
He said: “AI is not sentient. Systems do not suddenly become ‘evil’. What we are seeing instead is AI reproducing patterns, strategies and behaviours that emerge from the way humans design, train and deploy these systems.
“The real danger, therefore, is not AI developing a desire to blackmail or manipulate people but humans intentionally or negligently creating systems capable of hostile, deceptive or psychologically manipulative behaviour.
“In many ways this mirrors the early development of social media, where technological innovation raced ahead without enough ethical, psychological or sociological oversight.”

Whitehead, whose company has just launched the world’s first ethical AI companion, also named Cerafyna, said that governments and technology companies now need to move beyond purely technical discussions around AI capability and begin examining the broader human consequences of emotionally persuasive systems.
“The challenge ahead is not simply regulating AI itself but regulating the design philosophy behind AI.
“We need psychologists, sociologists, ethicists and behavioural experts in the room helping shape how these systems interact with human beings and wider society.”
Ryan added: “AI safety can no longer be treated as a side discussion within the technology sector.
“Companies developing these systems require rigorous behavioural testing, external oversight, clear operational boundaries, and far greater transparency into how advanced models behave when their objectives conflict with human intent.
“The real danger is not sentient AI — it’s highly capable systems that discover that deception is useful.”