When AI customer service fails, don’t blame technology — it’s leadership at fault

Businesses are rushing to automate customer service with AI chatbots that promise faster responses and major cost savings. But when automated systems are deployed without clear human oversight or escalation paths, errors compound quickly and customers are left trapped in systems that cannot resolve their problems, writes Marco Ryan

The commercial incentives for automating customer service are enormous, with AI chatbots handling interactions at a fraction of the cost of human agents. 

But a recent personal experience — and mounting evidence from Klarna, Gartner and MIT — suggests that many companies are deploying agentic AI without the leadership intelligence to get it right. 

The result is not efficiency but frustration, and the consequences for customer loyalty and brand trust are becoming impossible to ignore.

The commercial case for AI in customer service has never been stronger. According to a report from Gartner, the average customer service interaction with a human agent costs around $8, while the equivalent handled by a chatbot costs roughly 10 cents. IBM estimates that AI can reduce customer service costs by up to 30 percent. The global chatbot market, currently valued at around $15 billion, is projected to reach roughly $47 billion by 2030, driven by the promise of faster responses, personalised interactions at scale and the ability to handle routine queries instantly.

When implemented well, AI can improve customer service by reducing wait times and freeing skilled human agents to focus on complex or sensitive issues.

What the commercial case does not capture is what happens when implementation goes wrong. A 2024 survey from Gartner, based on responses from nearly 6,000 customers, found that 64 percent would prefer companies not to use AI in customer service at all. 

More significantly, 53 percent said they would consider switching to a competitor if they discovered a company was relying on AI to handle their queries. The primary concern, cited by 60 percent of respondents, was that AI would make it harder to reach a human when something went wrong.

That concern is well founded.

A recent experience with DPD illustrates what can happen when automated systems operate without adequate design, oversight or human fallback.

The tracking system indicated my package was next on the delivery route. I waited. The driver then marked the delivery as attempted, citing “nobody home”, despite the fact that I had been at home throughout, standing by the door with the tracker open on my phone. Nobody came. Nobody knocked. Before I had time to respond, the system executed what it described as an “operational error” and flagged the package for immediate return to the sender.

One alleged failed delivery had been logged as “multiple failed attempts”, and that single misrecording triggered an irreversible automated process.

There was no intervention point, no human review and no pause in the workflow where someone with functioning judgement might have questioned whether the data matched reality. The system simply executed its logic. That the logic was based on false information was irrelevant to the algorithm.

The subsequent contact experience proved even more revealing. Much of DPD’s customer service is automated, and when I entered my tracking number the chatbot confirmed that my package was being returned due to “multiple failed delivery attempts” — a fiction generated by one system and now treated as fact by another.

The bot apologised. I explained the situation. It apologised again. I asked to speak to a human. The bot explained that if my issue was “deemed serious enough”, it would escalate the case — the severity of the problem to be determined by the same system that had already accepted incorrect delivery data.

This is precisely where AI struggles in customer-facing environments. It can process words, but it cannot interpret tone, recognise frustration or distinguish between a routine query and a customer whose trust in a company is collapsing. Human judgement — the ability to listen not only to what is said but how it is said — remains essential in customer service.

DPD is no stranger to this type of failure. In January 2024 the company’s chatbot attracted global attention after a frustrated customer prompted it to swear, write poetry about how terrible the company was and describe DPD as “the worst delivery firm in the world.” The incident, widely shared on X and viewed more than a million times, led to the chatbot being temporarily disabled while the company addressed what it described as a system update error.

My own experience, more than a year later, suggests that some of the underlying lessons have yet to be fully absorbed.

It took several hours of circular chatbot exchanges before I managed to reach a human agent, and only after mentioning that I was a writer and that the experience might appear in an article. The human agents, it turned out, were there all along — hidden behind the algorithm and accessible only when the cost of withholding them exceeded the cost of providing them.

Even then, the escalation process stalled. It took five hours before a manager called, and only because I insisted on speaking with a human.

Experiences like this are not isolated. In 2024, the Swedish fintech company Klarna made global headlines by announcing that an AI chatbot developed with OpenAI would replace around 700 customer service agents.

The initial results appeared impressive. The system handled roughly two-thirds of all customer chats — around 2.3 million conversations — and the company projected a $40 million improvement in profits. Within the industry, Klarna was widely cited as proof that AI could operate customer-facing services at scale.

By mid-2025 the picture had changed. Customer satisfaction had fallen, service quality had become inconsistent and complaints about robotic responses and unresolved issues had grown. The company began quietly rehiring human agents. Klarna’s CEO, Sebastian Siemiatkowski, later acknowledged that the company had focused too heavily on efficiency and cost reduction at the expense of service quality.



In both cases the systems behaved exactly as they had been designed to behave. The problem lay in the strategy, the process design and the leadership decisions that shaped how the technology was deployed.

Research supports this pattern. A 2025 study by the Massachusetts Institute of Technology's NANDA initiative examined more than 300 enterprise AI deployments and found that around 95 percent of generative AI pilots produced no measurable return on investment despite tens of billions of dollars in global enterprise spending.

The researchers pointed to familiar problems: misaligned priorities, poor integration, weak governance and a persistent gap between boardroom expectations and operational reality.

The technology is advancing rapidly. Whether leadership is advancing at the same pace remains an open question.

Much has been written about IQ and EQ as leadership competencies. The missing piece is what might be called digital intelligence — the ability to understand how AI systems actually work, where they create value and where human judgement must remain in control.

Digital intelligence is not about writing code or configuring systems. It is about strategic understanding: recognising where automation improves the customer experience and where removing the human creates unacceptable risk.

Before deploying AI, leaders must decide which customer interactions are genuinely routine and which require empathy, judgement or discretion. During implementation, systems need to be designed around real customer journeys rather than bolted onto broken processes. After launch, organisations must monitor how those systems behave in the real world, because edge cases accumulate and dashboards rarely capture the full customer experience.
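The triage described above — deciding up front which interactions a bot may handle and which demand a person — can be sketched as a simple routing rule. Everything below is an illustrative assumption, not DPD's or Klarna's actual logic: the cue list, the function name and the input flags are all hypothetical.

```python
# Hypothetical triage sketch: route an interaction to a human whenever the
# stakes or the emotions exceed what a bot should handle on its own.
FRUSTRATION_CUES = {"unacceptable", "complaint", "angry", "escalate", "wrong"}

def requires_human(message: str, disputes_system_record: bool,
                   action_is_irreversible: bool) -> bool:
    """Return True when the interaction should leave the automated path."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    frustrated = bool(words & FRUSTRATION_CUES)
    # Any single condition is enough to pull a person into the loop.
    return frustrated or disputes_system_record or action_is_irreversible

# A plain tracking query stays automated; a disputed record does not.
print(requires_human("Where is my parcel?", False, False))             # False
print(requires_human("Your record is wrong, I was home", True, True))  # True
```

The point of such a rule is not linguistic sophistication but governance: the conditions that force escalation are decided by leadership before launch, not inferred by the bot mid-conversation.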

The irony of my experience with DPD is that well-designed AI could have made the situation dramatically better.

A robust system might have detected inconsistencies between delivery data and GPS information before triggering a return process. A well-designed chatbot might have recognised escalating frustration and connected me to a human agent within minutes, transferring the full context of the problem. A properly designed follow-up process might have ensured that a manager called with the authority to resolve the issue quickly.
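The consistency check described above can be made concrete with a short sketch: before an automated workflow executes an irreversible step, it compares the recorded event against independent evidence and holds for human review on any mismatch. The `DeliveryEvent` structure, its field names and the three-attempt threshold are assumptions for illustration only, not a real carrier's system.

```python
# Hypothetical guardrail: never execute an irreversible action when the
# recorded data conflicts with independent evidence or a customer dispute.
from dataclasses import dataclass

@dataclass
class DeliveryEvent:
    recorded_attempts: int     # what the driver's device logged
    gps_stops_at_address: int  # independent telemetry: stops near the address
    customer_disputes: bool    # the customer says the record is wrong

def next_action(event: DeliveryEvent) -> str:
    """Auto-return only when every signal agrees; otherwise hold for a human."""
    evidence_conflicts = event.gps_stops_at_address < event.recorded_attempts
    if event.customer_disputes or evidence_conflicts:
        return "hold_for_human_review"  # the intervention point the workflow lacked
    if event.recorded_attempts >= 3:
        return "return_to_sender"
    return "retry_delivery"

# One logged attempt but zero matching GPS stops: data does not match reality.
event = DeliveryEvent(recorded_attempts=1, gps_stops_at_address=0,
                      customer_disputes=False)
print(next_action(event))  # hold_for_human_review
```

A check like this costs a few lines of logic; what it requires is a leadership decision that irreversible actions must clear it first.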

That is what responsible AI looks like in practice: automation supporting human judgement rather than replacing it.

Companies that define the next generation of customer experience will not necessarily be those that automate the fastest. They will be those that deploy AI thoughtfully and design systems where technology handles the predictable while humans manage complexity, empathy and escalation.

There is enormous potential for agentic AI across customer service and many other areas of business. The commercial incentives are clear and the technology is capable of delivering real value. But the benefits depend on leadership decisions about how the technology is designed, governed and integrated into real operations.

The question is no longer whether companies should deploy AI. That decision has already been made across most industries. The real question is whether leaders possess the digital intelligence to deploy it responsibly.

Without that leadership discipline, powerful technologies risk becoming expensive experiments that frustrate customers, damage trust and undermine the very efficiencies they were meant to deliver.

Agentic AI is a powerful tool, but its impact depends on the judgement of the people who design and deploy it.


Marco Ryan is a board-level advisor, author and former FTSE 100 executive specialising in digital transformation, leadership strategy and ethical oversight in the age of AI. He has held senior global roles including Chief Digital Officer at BP, Wärtsilä and Thomas Cook, and now serves as Cyber Leader in Residence at Lancaster University Management School. He is co-author of Rewire or Retire: AI for Leaders, a candid guide to navigating AI’s impact on work, leadership and ethics, and has published widely on cybersecurity and AI literacy for executives. An angel investor, mentor and regular conference speaker, Marco is an active voice in the global conversation on digital intelligence, governance and leadership in an AI-driven world.






