 
aliciamartin@google.com
Google AI
Tuesday, November 14th
9:00 - 13:00
Room 2
https://forms.office.com/r/qAx2nujh21
When computers are able to recognize more diverse speech patterns, they can help provide more resources for people who have trouble being understood by technology or by other people in their daily lives. Project Euphonia is a research initiative that aims to make speech recognition more accessible for people with non-standard speech. In many cases, when someone with non-standard speech uses a voice-activated assistant, it does not understand them. Speech recognition models have not been trained on data that includes non-standard speech samples, leading to lower accuracy for individuals with speech challenges. Since the launch of our research, volunteers have contributed more than 1,600 hours of speech samples, creating the largest known non-standard speech dataset in the world (Jiang, P. (2022). "Euphonia Project: Automatic speech recognition research expands to include new languages, including Spanish").
                              
Our research demonstrated the potential for personalized automatic speech recognition (ASR) models to help individuals with non-standard speech be better understood by technology and by other people. We found that an ASR model fine-tuned with an individual's voice recordings could recognize that individual's voice better than human transcribers (Green, J., et al. (2023). "Automatic Speech Recognition of Disordered Speech: Personalized Models Outperforming Human Listeners on Short Phrases." Science 382.6387: 34-36. doi:10.1126/science.abm7687). In many circumstances, such as voice-activated smart home assistants, personalized ASR models required only 20 minutes of speech data for most individuals (Tobin, J., et al. (2021). "Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets." doi.org/10.48550/arXiv.2110.04612).
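To give a concrete sense of what this kind of personalization involves, the sketch below fine-tunes a publicly available CTC-based ASR model on a small set of one speaker's recordings. This is not the Euphonia implementation; the base model name, data layout, and hyperparameters are illustrative assumptions.

# Illustrative sketch only: fine-tune a public ASR model on a small
# personalized dataset (e.g., roughly 20 minutes of one speaker's 16 kHz audio).
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

MODEL_NAME = "facebook/wav2vec2-base-960h"  # assumed public base model
processor = Wav2Vec2Processor.from_pretrained(MODEL_NAME)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_NAME)

def fine_tune(recordings, epochs=5, lr=1e-5):
    """recordings: hypothetical list of (waveform, transcript) pairs for one speaker."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for waveform, transcript in recordings:
            # Convert raw audio to model inputs and the transcript to label ids.
            inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
            labels = processor.tokenizer(transcript, return_tensors="pt").input_ids
            # CTC loss between the model's predictions and the reference transcript.
            loss = model(inputs.input_values, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return model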
                              
Working closely with Trusted Testers in the Euphonia program, we learned that personalized models can be very useful, but that for many users, recording dozens or hundreds of examples can be challenging. In addition, the personalized models did not always perform well in freeform conversation. To address these challenges, Euphonia's research efforts have focused on speaker-independent ASR (SI-ASR), so that models work better out of the box for people with non-standard speech and no additional training is necessary. We demonstrated that using Euphonia's speech corpus for model fine-tuning could improve performance on non-standard speech by ~30% (Tobin, J., et al. (2023). "Responsible AI at Google Research: AI for Social Good").
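For context, a relative improvement like the ~30% figure above is typically measured by comparing word error rate (WER) on the same held-out test set before and after fine-tuning. The snippet below is a generic sketch of that calculation; the transcripts passed in are hypothetical placeholders, not Euphonia data.

# Generic sketch: relative WER improvement of a fine-tuned model over a baseline.
from jiwer import wer  # assumes the jiwer library for WER computation

def relative_wer_improvement(references, baseline_hyps, finetuned_hyps):
    """Fraction by which the fine-tuned model reduces WER relative to the baseline."""
    baseline_wer = wer(references, baseline_hyps)
    finetuned_wer = wer(references, finetuned_hyps)
    return (baseline_wer - finetuned_wer) / baseline_wer

# Example: a baseline WER of 0.40 dropping to 0.28 is a 0.30 (30%) relative improvement.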
                              
These contributions have enabled Google's speech and research teams to conduct cutting-edge machine learning research in speech recognition, including the ability to create personalized models that understand individual people, as well as speech-to-speech recognition that allows words to be repeated in a clear synthesized voice. This research also helped us launch Project Relate, an Android app currently in beta, which gives people access to a personalized model that helps make communication more accessible.
                              
We have now expanded our efforts to more languages, including Spanish. This data collection will help Google build more inclusive speech recognition models, including for Spanish speakers.
Researchers and LATAM accessibility groups.
No prerequisites required.
None required.