Banking has crossed a line: continuing to experiment with AI is now riskier than committing to it.
In the early days, ‘staying safe’ through experimentation helped secure future advantages. Institutions that are still running proofs of concept are no longer staying safe; they are accumulating CX debt while their competitors turn AI into measurable capacity.
Glia’s 2026 Benchmark Report finds that banks should achieve at least 90% intent accuracy for AI to be reliable enough to contain high-volume queries without eroding trust.
Justin DiPietro, Chief Strategy Officer & Co-Founder at Glia, argues that banks that choose to keep AI stuck in pilots will fall behind competitors, with AI in production now being a necessity, not a choice.
“The problem of keeping [AI projects] in experimental mode is that you’re just going to be outcompeted by the competition,” he explained.
“The banks don’t differentiate in many places, [but] they differentiate on customer experience.
“It’s not optional anymore. [Financial leaders] have to get to production.”
When Pilots Become a Competitive Disadvantage
Whilst experimenting with AI allowed teams in the early stages to understand new capabilities before deployment, small AI pilots can no longer keep pace with how fast customer expectations and technology are shifting in banking.
A tactic that once reduced risk now slows progress in areas where competitors are moving quickly, as customers now expect seamless digital experiences, rapid problem resolution, and consistent support across channels.
Banks that choose to stay in pilot mode delay improvements that customers already view as standard, falling behind as competitors gain efficiency and strengthen satisfaction.
This shift in expectations means that banks must now be able to move from controlled pilots to broader implementation, as scaling AI not only accelerates the adoption of new capabilities but also provides real-world data to identify measurable business outcomes.
Without this shift, experimentation alone cannot keep pace with evolving expectations or demonstrate a tangible competitive advantage.
Dan Michaeli, CEO & Co-Founder at Glia, explained that because AI capabilities are improving rapidly, staying in experimentation mode puts a company at a competitive disadvantage.
“At the end of the day, it’s the speed at which these capabilities are evolving and getting better,” he said.
“That’s why you can’t afford to be in experimentation mode anymore. You’re just at a fundamental disadvantage.”
Why Underperforming AI Breaks Banking CX
Many banking customers have been conditioned by poor automation experiences, and many now choose to skip automation entirely.
With mistrust already baked into the banking experience, these prior interactions can make customers more likely to ask for a human in regulated environments.
When AI underperforms in banking, wrong answers can force customers to repeat themselves and escalate issues, leading to longer handling times and duplicate contacts. For a bank, this can significantly affect operations, putting more queue pressure on existing human agents and driving up costs.
When trust is damaged in these instances, customers can start to assume that the bank’s digital services cannot reliably help them, which can drag down satisfaction and increase dependency on human agents.
Banking customers now expect precision: for balances, payment status, authentication steps, and policy rules, a “good enough” answer is no longer acceptable.
When automation gets it wrong, the downside isn’t just friction; it’s reputational and compliance risk in a high-trust environment.
Banks that settle for “good enough” AI risk greater CX and compliance fallout and will likely struggle to build the trust and accuracy needed to scale.
“I don’t believe that in banking or regulated industries, ‘good enough’ [is acceptable],” DiPietro continued.
“You cannot solve something you don’t understand, just like you cannot optimize something you can’t measure. If you don’t understand, you cannot bring it to a resolution.”
The path out of “good enough” banking now requires clarity, meaning banks will need to know what customers are trying to do, how often the AI correctly understands, and where failures occur.
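As a rough illustration of what that clarity could look like in practice, the sketch below computes, for each intent, how often it occurs, how often the AI understood it correctly, and which wrong intent it was most often mistaken for. The function name `intent_report`, the record format, and the sample data are all hypothetical and are not taken from Glia’s report or product.

```python
from collections import Counter, defaultdict


def intent_report(records):
    """Summarise intent understanding from labelled interactions.

    `records` is an iterable of (true_intent, predicted_intent) pairs.
    This record format is an assumption made for illustration only.
    """
    totals = Counter()                 # how often each intent occurs
    correct = Counter()                # how often it was understood correctly
    confusions = defaultdict(Counter)  # where the failures went

    for true_intent, predicted_intent in records:
        totals[true_intent] += 1
        if predicted_intent == true_intent:
            correct[true_intent] += 1
        else:
            confusions[true_intent][predicted_intent] += 1

    report = {}
    for intent, volume in totals.items():
        misses = confusions[intent]
        report[intent] = {
            "volume": volume,
            "accuracy": correct[intent] / volume,
            # Most frequent wrong prediction, or None if there were no misses.
            "top_confusion": misses.most_common(1)[0][0] if misses else None,
        }
    return report


# Made-up sample data, purely for demonstration.
sample = [
    ("balance_inquiry", "balance_inquiry"),
    ("balance_inquiry", "balance_inquiry"),
    ("balance_inquiry", "move_money"),
    ("move_money", "move_money"),
]
report = intent_report(sample)
```

A report like this covers all three questions at once: volume shows what customers are trying to do, accuracy shows how often the AI understands, and the top confusion points to where failures occur.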
The 90%+ Threshold – the Capabilities That Make AI Production-Ready
Today, production-ready AI in banking is no longer just a vague aspiration; it’s measurable.
According to Glia’s Benchmark Report, 90%+ intent understanding should be the minimum level where AI becomes operationally safe and valuable in banking CX.
This benchmark is achievable on high-volume intents, with 94.81% for balance inquiries, 91.3% for direct deposit setup, and 90.7% for move money.
When AI is trained for banking, it should correctly handle a routine request almost every time, because even a small miss rate creates a disproportionate amount of friction. Reaching that level of reliability is what moves AI from a fragile pilot to a scalable system.
Getting to 90%+ isn’t about switching on a generic model; it requires domain-trained intent understanding, orchestration with safe escalation, and governance that prevents hallucinations.
For CX leaders, the question isn’t whether you have a pilot; it’s whether you can prove the AI understands customers consistently and fails safely when it doesn’t.
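One way to picture “failing safely” is a routing rule that only lets the AI handle a request when it recognises a supported intent with high confidence, and hands everything else to a human. The sketch below is a minimal illustration under assumptions: the `Prediction` type, the `route` function, the intent names, and the 0.90 confidence floor are all hypothetical. A per-request confidence floor is also not the same thing as the 90% intent-accuracy benchmark, which is measured across many interactions.

```python
from dataclasses import dataclass

# Hypothetical values chosen for illustration; not from Glia's report.
CONFIDENCE_FLOOR = 0.90
AUTOMATED_INTENTS = {"balance_inquiry", "direct_deposit_setup", "move_money"}


@dataclass
class Prediction:
    intent: str        # the intent the model believes the customer has
    confidence: float  # model confidence in that intent, from 0.0 to 1.0


def route(prediction, floor=CONFIDENCE_FLOOR):
    """Automate only supported, high-confidence intents; otherwise escalate."""
    if prediction.intent in AUTOMATED_INTENTS and prediction.confidence >= floor:
        return "automate"
    # Unknown intent or low confidence: hand off rather than guess.
    return "escalate_to_human"


confident = route(Prediction("balance_inquiry", 0.97))
uncertain = route(Prediction("balance_inquiry", 0.62))
unsupported = route(Prediction("dispute_charge", 0.99))
```

The design choice here is that escalation is the default: automation has to earn each request, so an uncertain or unsupported query reaches a human instead of receiving a guessed answer.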
“This is such a challenge within this particular industry,” Michaeli continued.
“You’re applying something that is probabilistic to an industry that is used to being highly deterministic.”