Twilio has added the new Realtime API from OpenAI to its Communications Platform.
The API facilitates natural speech-to-speech interactions – similar to how the Advanced Voice Mode works within ChatGPT – and comes with preset voices.
With the API, developers can pass any text or audio into GPT-4o and have the model respond via either medium or both.
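As a rough illustration of that choice between text and audio, the sketch below builds the kind of `session.update` message a client might send over the Realtime API's WebSocket connection. The field names (`modalities`, `voice`, `input_audio_format`, `output_audio_format`) follow the beta API as documented around the time of the announcement and should be treated as assumptions that may have changed since. `build_session_update` is a hypothetical helper, and nothing here touches the network.

```python
import json


def build_session_update(want_audio=True, voice="alloy"):
    """Build a session.update event that selects the response modalities.

    Field names are assumed from the beta Realtime API and may differ
    in later versions.
    """
    # Text is always requested; audio is added on top when wanted.
    modalities = ["text", "audio"] if want_audio else ["text"]

    session = {
        "modalities": modalities,
        # G.711 mu-law is a common telephony codec; assumed here so that
        # phone audio could be passed through without transcoding.
        "input_audio_format": "g711_ulaw",
    }
    if want_audio:
        # A preset voice only matters when spoken output is requested.
        session["voice"] = voice
        session["output_audio_format"] = "g711_ulaw"
    return {"type": "session.update", "session": session}


if __name__ == "__main__":
    print(json.dumps(build_session_update(), indent=2))
```

Calling `build_session_update(want_audio=False)` would yield a text-only session, which might suit a messaging channel rather than a phone call.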
By offering this capability, Twilio hopes its customers will build automated customer experiences that blend voice, messaging, and possibly multiple languages.
Such customer journeys will leverage the full capabilities of the modern smartphone alongside the next generation of large language models (LLMs).
The announcement strengthens the Twilio-OpenAI partnership, which kicked off last year when the CPaaS leader introduced CustomerAI to embed generative AI (GenAI) across its platform.
Now, Inbal Shani, Chief Product Officer at Twilio, shared her excitement at taking the relationship to the next level and enabling developers to build new experiences.
“Integrating OpenAI’s Realtime API with Twilio’s platform enables businesses to offer more natural, real-time AI voice interactions at scale,” she said.
“Businesses can use this to create voice experiences that feel more human and can reduce operational costs and drive higher customer satisfaction.”
Twilio also promises that automated conversations will feel “more like real human dialog”, as the Realtime API lowers latency. It also considers conversation pacing, tone, and interruption handling to enable a better balance between speaking and listening.
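Interruption handling of this kind can be pictured as a small routing step between the model and the phone call. The sketch below is one possible shape, not Twilio's or OpenAI's implementation: it assumes the beta Realtime API event names `input_audio_buffer.speech_started` and `response.audio.delta`, and the `clear` and `media` messages of Twilio Media Streams. `on_model_event` is a hypothetical helper.

```python
def on_model_event(event, stream_sid):
    """Return the messages to send to the phone leg for one model event.

    `event` is a decoded Realtime API server event (a dict) and
    `stream_sid` identifies the Twilio Media Streams connection.
    """
    event_type = event.get("type")

    if event_type == "input_audio_buffer.speech_started":
        # The caller started talking: drop any assistant audio still
        # queued on the call so the assistant stops speaking.
        return [{"event": "clear", "streamSid": stream_sid}]

    if event_type == "response.audio.delta":
        # Forward one base64-encoded chunk of synthesized speech.
        return [{
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": event["delta"]},
        }]

    # Every other event type is ignored in this sketch.
    return []
```

A fuller bridge would likely also tell the model how much of its reply the caller actually heard, which this sketch leaves out.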
These features are especially helpful for use cases such as a virtual agent for service and sales.
However, the possibilities don’t end there. Twilio customers can develop their own use cases, such as a real-time voice translation tool. Consider how this could work in a vertical like the public sector: constituents and staff who speak different languages could hold fluid conversations with one another.
Olivier Godement, Head of Product, API at OpenAI, shared his excitement at seeing Twilio customers bring some of these opportunities to life.
“The Realtime API’s speech-to-speech capabilities are designed to address strong customer demand for conversational AI solutions,” he said.
“We’re thrilled to collaborate with Twilio to deliver a world-class developer experience for building and deploying conversational AI agents.”
Developers may also connect the Realtime API to the Twilio Customer Engagement Platform. That will enable them to integrate virtual agents into their existing workflows.
There, businesses can record those automated calls, extract insights, and funnel data into a CRM or customer data platform (CDP), like Segment.
Indeed, if Twilio customers also have the aforementioned CustomerAI product, they may pull more insight into the Golden Profiles within Segment, which are individualized customer records.
With these profiles, developers can enrich the outputs of a virtual agent with data like previous interactions, purchases, and more. That enables personalized, automated voice experiences.
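One simple way to picture this enrichment is to fold profile data into the instructions given to the virtual agent before a call begins. The sketch below assumes a plain dictionary standing in for a customer profile. The keys `name`, `last_interaction`, and `recent_purchases` are invented for illustration and are not Segment's schema, and `build_instructions` is a hypothetical helper.

```python
def build_instructions(base_prompt, profile):
    """Append known customer facts to the agent's base instructions.

    `profile` is a hypothetical dict standing in for a customer record;
    its keys are invented for this example.
    """
    facts = []
    if profile.get("name"):
        facts.append(f"Customer name: {profile['name']}")
    if profile.get("last_interaction"):
        facts.append(f"Last interaction: {profile['last_interaction']}")
    purchases = profile.get("recent_purchases", [])
    if purchases:
        # Cap the list at three items so the prompt stays short.
        facts.append("Recent purchases: " + ", ".join(purchases[:3]))

    if not facts:
        # Nothing is known about the caller: use the generic prompt.
        return base_prompt

    context = "\n".join(f"- {fact}" for fact in facts)
    return base_prompt + "\n\nKnown customer context:\n" + context
```

Keeping the customer context in a separate, clearly labeled section of the prompt is a design choice made here for readability; other layouts would work equally well.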
The existing Twilio AI Assistant already leverages Segment to do this, while also extracting information to bolster each customer’s Golden Profile in real time.
For instance, if a customer shares a preference, it will be saved for future use in Segment, becoming available to every department that leverages the CDP. That could include service, sales, marketing, commerce, and in-store teams.
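To make the preference example concrete, the sketch below prepares an `identify` request for Segment's HTTP Tracking API, which attaches traits to a user record. How Twilio's own products write to Segment is not described in the announcement, so this is only an assumption about one way a developer could do it. `build_identify_request` is a hypothetical helper, the trait name in the usage note is invented, and the function only prepares the request without sending it.

```python
import base64
import json

SEGMENT_IDENTIFY_URL = "https://api.segment.io/v1/identify"


def build_identify_request(write_key, user_id, traits):
    """Prepare (url, headers, body) for a Segment identify call."""
    # Segment's HTTP API uses Basic auth with the write key as the
    # username and an empty password.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"userId": user_id, "traits": traits})
    return SEGMENT_IDENTIFY_URL, headers, body
```

For example, `build_identify_request(write_key, "user-42", {"preferred_contact_channel": "sms"})` would prepare a request recording that a caller prefers to be contacted by SMS; the resulting URL, headers, and body could then be passed to any HTTP client.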
Elsewhere, Twilio recently added RCS to its Communications Platform. RCS’s ability to share audio notes, combined with the speech capabilities of OpenAI’s Realtime API, could pave the way for more innovative, automated messaging campaigns.