Lessons from building acoustic models with a million hours of speech

SHK Parthasarathi, N Strom - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
This is a report of our lessons learned building acoustic models from 1 million hours of
unlabeled speech, while labeled speech is restricted to 7,000 hours. We employ …

Deep neural network features and semi-supervised training for low resource speech recognition

S Thomas, ML Seltzer, K Church… - … on acoustics, speech …, 2013 - ieeexplore.ieee.org
We propose a new technique for training deep neural networks (DNNs) as data-driven
feature front-ends for large vocabulary continuous speech recognition (LVCSR) in low …

Unsupervised language model adaptation

M Bacchiani, B Roark - 2003 IEEE International Conference on …, 2003 - ieeexplore.ieee.org
This paper investigates unsupervised language model adaptation from ASR transcripts. N-gram
counts from these transcripts can be used either to adapt an existing n-gram model or …

A comparison of the data requirements of automatic speech recognition systems and human listeners

RK Moore - INTERSPEECH, 2003 - academia.edu
Since the introduction of hidden Markov modelling there has been an increasing emphasis
on data-driven approaches to automatic speech recognition. This derives from the fact that …

Supervised and unsupervised PCFG adaptation to novel domains

B Roark, M Bacchiani - … of the 2003 Human Language Technology …, 2003 - aclanthology.org
This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to
a novel domain, using maximum a posteriori (MAP) estimation. The MAP framework is …

Unsupervised training and directed manual transcription for LVCSR

K Yu, M Gales, L Wang, PC Woodland - Speech Communication, 2010 - Elsevier
A significant cost in obtaining acoustic training data is the generation of accurate
transcriptions. When no transcription is available, unsupervised training techniques must be …

Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration

Y Huang, D Yu, Y Gong, C Liu - Interspeech, 2013 - microsoft.com
We present our study on semi-supervised Gaussian mixture model (GMM) hidden Markov
model (HMM) and deep neural network (DNN) HMM acoustic model training. We analyze …

Weak top-down constraints for unsupervised acoustic model training

A Jansen, S Thomas… - 2013 IEEE International …, 2013 - ieeexplore.ieee.org
Typical supervised acoustic model training relies on strong top-down constraints provided
by dynamic programming alignment of the input observations to phonetic sequences …

MAP adaptation of stochastic grammars

M Bacchiani, M Riley, B Roark, R Sproat - Computer speech & language, 2006 - Elsevier
This paper investigates supervised and unsupervised adaptation of stochastic grammars,
including n-gram language models and probabilistic context-free grammars (PCFGs), to a …

Deciphering speech: a zero-resource approach to cross-lingual transfer in ASR

O Klejch, E Wallington, P Bell - arXiv preprint arXiv:2111.06799, 2021 - arxiv.org
We present a method for cross-lingual training of an ASR system using absolutely no
transcribed training data from the target language, and with no phonetic knowledge of the …