Do you think AI has actually made customer support better?
For businesses, the answer is mostly yes. Costs have come down, coverage has expanded, and chatbots handle queries that would have once meant a 20-minute wait for a human agent.
But what about for the customer? That’s where things get a little blurrier.
In many instances, customers now have to parse a wall of generated text, judge whether what they’re reading is accurate, rephrase their question when the AI misses the point, and verify whatever instructions they’ve been given before they act on them.
This means that for many customer service interactions, the customer has to put in more work, not less.
This point was raised by Shahan Lilja, Co-Founder of Mavenoid, in a recent discussion with CX Today:
“Company effort has gone down, but customer effort has risen.”
This gap between what organizations save and what customers spend is at the crux of AI-assisted support right now – and Lilja argues that multimodal AI is the bridge across it.
The Hidden Tax
The concept of customer effort isn’t new in CX. The Customer Effort Score has been a staple of support measurement for over a decade. But the way that effort manifests has changed with the introduction of large language models.
The old failure mode was friction from poor routing or slow response. The new one is cognitive load: the mental work required to interact productively with an AI that sounds confident but isn’t always right.
Lilja describes this as a “hidden tax on every AI interaction, paid by customers.”
These taxing moments will be familiar to anyone who has recently used a support chatbot.
While these effort-sapping instances don’t surface in dashboard metrics such as containment rates or CSAT scores, they add up quickly.
And this increased cognitive load can lead to abandoned sessions, repeat contacts, and a slow erosion of confidence.
Within this effort tax, Lilja points to a specific area that he believes to be one of the most damaging: “AI slop.”
“AI slop is low-quality, generic content produced by AI.”
This slop has the potential to cause serious damage to an organization’s customer service operations.
An AI that tells a customer to press the wrong button can damage their product or, in a worst-case scenario, create a safety issue.
And the liability implications are real. Earlier this year, Woolworths was forced to make adjustments to its AI chatbot after it falsely claimed to have an “angry mother” and presented itself as having personal family experiences.
Other examples include Air Canada being ordered to pay compensation after its chatbot gave incorrect refund information, and a customer convincing DPD’s chatbot to swear and write a poem about what a “terrible” company DPD is.
These aren’t arguments against AI in support, but rather, they’re arguments for AI that’s grounded in something more than language.
Why Text Keeps Failing
The structural problem with text-only AI is that language is, as Lilja puts it, “a tree of possibilities.”
Every sentence carries a range of interpretations, and the AI picks one. When that choice is even slightly wrong, the support interaction goes off course, and the customer bears the cost of correcting it.
Images and visual context work differently. They constrain interpretation. A photograph of a cracked product component, a real-time view of an error light, a video showing exactly which cable is loose – these don’t give the AI a tree of possibilities; they give it a scene, as Lilja explains:
“It’s harder to b******t a human with a false image than with false words.”
“Reality is more constrained. That’s how you bring hallucination rates down – by grounding things in more types of information.”
This is what Lilja refers to as “visual grounding,” and it’s one of six properties that define effective multimodal support:
- Enhanced context
- Reduced ambiguity
- Cross-modal consistency
- State awareness
- Real-time feedback
- Visual grounding
Together, they address the failure modes of text-only AI at a structural level, rather than just patching over them with better prompts.
The Feedback Loop Problem
One failure mode in particular stands out: delayed feedback.
In text-based support, a customer can follow a set of instructions for 10 or 15 minutes before discovering that step three was wrong.
By that point, they’ve potentially made things worse and are starting over from scratch, except now they’re frustrated and trust the AI’s next answer less.
Real-time visual feedback can address this issue. A video guide that shows a customer cleaning a washing machine’s drain filter can flag immediately if they’re doing it incorrectly. A live visual check on a hardware installation can catch a misconnected cable before the customer powers the device back on and damages it.
It may be an old saying, but Lilja’s remark that “a picture is worth a thousand words” is a surprisingly succinct argument for multimodal, particularly when the alternative is a hallucinated instruction in a customer support chat.
The brands that recognize this will be able to move beyond chasing incremental NPS gains to building support that customers can actually rely on – where the AI’s confidence is backed by something real.