Speechmatics Announces Autonomous Speech Recognition

Speechmatics' technology has been perfected on huge amounts of unlabelled data directly from the internet

Published: October 27, 2021

Sandra Radlovački

Speechmatics, a speech recognition technology provider, has launched its Autonomous Speech Recognition software.

Speechmatics’ technology has been perfected using huge amounts of unlabelled data taken directly from the internet, such as social media content and podcasts. Using self-supervised learning, the technology is now trained on 1.1 million hours of audio, which gives a far more comprehensive representation of all voices and dramatically reduces AI bias and errors in speech recognition.

Katy Wigdahl, CEO of Speechmatics, said: “We are on a mission to deliver the next generation of machine learning capabilities, and through that offer more inclusive and accessible speech technology. This announcement today is a huge step toward achieving that mission.”

“Our focus in tackling AI bias has led to this monumental leap forward in the speech recognition industry and the ripple effect will lead to changes in a multitude of different scenarios. Think of the incorrect captions we see on social media, court hearings where words are mistranscribed, and eLearning platforms that have struggled with children’s voices throughout the pandemic. Errors people have had to accept until now can have a tangible impact on their daily lives.”

Using the latest techniques in deep learning and new self-supervised models, Speechmatics recorded an overall accuracy of 82.8 percent for African American voices on datasets used in Stanford’s Racial Disparities in Speech Recognition study, compared with Google (68.7 percent) and Amazon (68.6 percent). This level of accuracy equates to a 45 percent reduction in speech recognition errors, the equivalent of three words in an average sentence. Speechmatics’ Autonomous Speech Recognition delivers similar improvements in accuracy across accents, dialects, age, and other sociodemographic characteristics.
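The article does not say how the 45 percent figure was derived. One plausible reading, sketched below, treats each error rate as 100 minus the reported accuracy and compares Speechmatics' error rate with each competitor's. The function name `relative_error_reduction` is illustrative and not taken from Speechmatics.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Fractional drop in error rate, given two accuracies in percent.

    Assumption: error rate = 100 - accuracy. The article does not
    state its exact metric, so this is only one plausible reading.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Accuracy figures for African American voices, as quoted in the article.
speechmatics_accuracy = 82.8
competitor_accuracy = {"Google": 68.7, "Amazon": 68.6}

for name, accuracy in competitor_accuracy.items():
    reduction = relative_error_reduction(accuracy, speechmatics_accuracy)
    print(f"Versus {name}: {reduction:.1%} fewer errors")
```

Under that assumption, the result against either competitor rounds to 45 percent, which is consistent with the figure quoted in the article.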

Speechmatics also outperforms competitors on children’s voices, having recorded 91.8 percent accuracy compared with Google (83.4 percent) and Deepgram (82.3 percent), based on the open-source project Common Voice.