Grapheme-to-phoneme conversion using long short-term memory recurrent neural networks

K Rao, F Peng, H Sak… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
Grapheme-to-phoneme (G2P) models are key components in speech recognition and text-to-
speech systems as they describe how words are pronounced. We propose a G2P model …

Improving seq2seq TTS frontends with transcribed speech audio

S Sun, K Richmond, H Tang - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
Due to the data inefficiency and low speech quality of grapheme-based end-to-end text-to-
speech (TTS), having a separate high-performance TTS linguistic frontend is still commonly …

"Play PRBLMS": Identifying and Correcting Less Accessible Content in Voice Interfaces

A Springer, H Cramer - Proceedings of the 2018 CHI Conference on …, 2018 - dl.acm.org
Voice interfaces often struggle with specific types of named content. Domain-specific
terminology and naming may push the bounds of standard language, especially in domains …

Learning Personalized Pronunciations for Contact Name Recognition.

A Bruguier, F Peng, F Beaufays - INTERSPEECH, 2016 - isca-archive.org
Automatic speech recognition that involves people's names is difficult because names follow
a long-tail distribution and they have no commonly accepted spelling or pronunciation. This …

Pronunciation Learning with RNN-Transducers.

A Bruguier, D Gnanapragasam, L Johnson, K Rao… - …, 2017 - isca-archive.org
Most speech recognition systems rely on pronunciation dictionaries to provide accurate
transcriptions. Typically, some pronunciations are carved manually, but many are produced …

Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks.

D van Esch, M Chua, K Rao - INTERSPEECH, 2016 - isca-archive.org
Word pronunciations, consisting of phoneme sequences and the associated syllabification
and stress patterns, are vital for both speech recognition and text-to-speech (TTS) systems …

System and method for eliciting open-ended natural language responses to questions to train natural language processors

SJ Rothwell, D Braga, AK Elshenawy… - US Patent …, 2017 - Google Patents
Systems and methods of gathering text commands in response to a command context using a
first crowdsourced are discussed herein. A command context for a natural language …

System and method for validating natural language content using crowdsourced validation jobs

SJ Rothwell, D Braga, AK Elshenawy… - US Patent …, 2016 - Google Patents
… whether or not the text accurately represents the natural language content may be
received from each of …

System and method of recording utterances using unmanaged crowds for natural language processing

D Braga, SJ Rothwell, F Romani… - US Patent …, 2017 - Google Patents
2016-07-20 Assigned to VOICEBOX TECHNOLOGIES CORPORATION reassignment
VOICEBOX TECHNOLOGIES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST …