AI chipmaking giant NVIDIA has invested in UK-based unicorn ElevenLabs, a specialist in AI-generated speech and audio technology.
The move underscores the growing demand for human-like voice AI in customer interactions.
The investment also signals NVIDIA’s growing interest in the generative AI ecosystem, particularly in applications that transform how brands interact with customers.
The deal was confirmed by ElevenLabs’ Co-Founder and CEO Mati Staniszewski in a video, but the financial details were not disclosed.
Founded in 2022, ElevenLabs has gained attention for its advanced AI voice synthesis platform, which can create hyper-realistic speech in multiple languages, accents, and emotional tones.
The technology combines deep learning with proprietary voice cloning and dubbing capabilities, enabling businesses to deliver real-time, personalized, natural-sounding interactions and accessibility tools across voice channels.
ElevenLabs last week announced the continued expansion of its UK and US operations. The company trains its text-to-speech and speech-to-text AI models on systems powered by NVIDIA’s Blackwell graphics processing units (GPUs) and accelerated software, and collaborates with the company on developing new technologies.
Jensen Huang, NVIDIA’s CEO, used ElevenLabs’ AI speech and voice cloning technology to narrate several chapters of his keynote speech at Computex last year in both English and Mandarin.
He created the voice in under an hour with just seven minutes of recorded audio, according to the company.
“Whenever my voice is delivered digitally using artificial intelligence, it’s the ElevenLabs platform that I’m using,” Huang said in the video. “Speech to text is just technology. Text to speech is artistry; that craft that goes into a product that gets integrated with technology.”
Staniszewski said in the video that he expects AI to pass the Turing Test for conversation soon, which would allow chatbots and agents to engage with customers in a way that is indistinguishable from humans in terms of flow, tone, and understanding.
“We think over the next year or two we’ll see the Turing Test passed for conversation in most settings, whether it’s immersive gaming, personal agents, calling customer experience — all of that will be so elevated with incredible emotion.”
Huang added:
“With the delivery of the emotion, you’re also delivering empathy and when you’re delivering empathy, you’re delivering connection. And so all of that ability to capture that in artificial intelligence is quite incredible.”
For customer experience leaders, this opens up new possibilities in automating service interactions and scaling multilingual support.
Better Voice, Better Experience
From virtual assistants and contact center agents to personalized voiceovers in digital content, brands can use ElevenLabs’ technology to deploy AI-generated voices that are indistinguishable from human agents.
That can help reduce wait times while maintaining conversational quality, which is key for building customer trust and loyalty.
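In practice, deploying such a voice in a service channel typically means calling a hosted text-to-speech API. The sketch below shows what assembling such a request might look like, assuming an endpoint shaped like ElevenLabs’ publicly documented v1 text-to-speech API; the voice ID, API key, and model name are placeholders, and the exact paths and fields should be checked against the current API reference.

```python
import json

# Base URL of the hosted TTS service (assumption: ElevenLabs' public v1 API).
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(voice_id: str, text: str,
                      model_id: str = "eleven_multilingual_v2"):
    """Assemble the URL, headers, and JSON body for a text-to-speech call.

    The endpoint path and field names mirror the publicly documented API
    at the time of writing; treat them as assumptions, not a guarantee.
    """
    url = f"{API_BASE}/text-to-speech/{voice_id}"
    headers = {
        "xi-api-key": "YOUR_API_KEY",  # placeholder credential
        "Content-Type": "application/json",
    }
    body = {"text": text, "model_id": model_id}
    return url, headers, json.dumps(body)


# Example: prepare (but not send) a request for a customer-service greeting.
# "example-voice-id" is hypothetical; real IDs come from the voice library.
url, headers, body = build_tts_request(
    voice_id="example-voice-id",
    text="Thanks for calling. How can I help you today?",
)
```

Sending the request (for example with an HTTP client library) would return synthesized audio that a contact-center platform could stream back to the caller.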
Companies like Microsoft are also recognizing the importance of providing more natural customer interactions with voice AI.
The tech giant has added features like HD voice and constrained speech recognition to its Dynamics 365 Contact Center platform to help businesses reduce customer frustration in dealing with AI agents and provide seamless, personalized interactions on self-service channels.
This growing focus on voice quality and natural interaction goes hand in hand with a broader push toward accessibility in customer service, ensuring that every customer, regardless of ability or language, can engage with brands equally.
Retail giant Amazon recently expanded its accessibility efforts, introducing a new customer service option in French Sign Language to enable deaf and hard-of-hearing customers to connect with its customer service team through a video call.
ElevenLabs’ accessibility technology is designed to help businesses improve their customer service and experience in this way, by making spoken content more inclusive.
By embedding natural-sounding, adaptive voice AI into their CX strategies, companies can break down accessibility barriers related to vision, language, cognitive processing, and neurodiversity to meet the needs of all customers.