Frontier research on low-resource speech recognition technology

W Slam, Y Li, N Urouvas - Sensors, 2023 - mdpi.com
With the development of continuous speech recognition technology, users have put forward
higher requirements in terms of speech recognition accuracy. Low-resource speech …

The subspace Gaussian mixture model—A structured model for speech recognition

D Povey, L Burget, M Agarwal, P Akyazi, F Kai… - Computer Speech & …, 2011 - Elsevier
We describe a new approach to speech recognition, in which all Hidden Markov Model
(HMM) states share the same Gaussian Mixture Model (GMM) structure with the same …

Subspace Gaussian mixture models for speech recognition

D Povey, L Burget, M Agarwal, P Akyazi… - … , Speech and Signal …, 2010 - ieeexplore.ieee.org
We describe an acoustic modeling approach in which all phonetic states share a common
Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of …

Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models

L Burget, P Schwarz, M Agarwal… - … on acoustics, speech …, 2010 - ieeexplore.ieee.org
Although research has previously been done on multilingual speech recognition, it has been
found to be very difficult to improve over separately trained systems. The usual approach …

A basis representation of constrained MLLR transforms for robust adaptation

D Povey, K Yao - Computer Speech & Language, 2012 - Elsevier
Constrained Maximum Likelihood Linear Regression (CMLLR) is a speaker
adaptation method for speech recognition that can be realized as a feature-space …

Approaches to automatic lexicon learning with limited training examples

N Goel, S Thomas, M Agarwal, P Akyazi… - … , Speech and Signal …, 2010 - ieeexplore.ieee.org
Preparation of a lexicon for speech recognition systems can be a significant effort in
languages where the written form is not exactly phonetic. On the other hand, in languages …

Subspace Gaussian mixture based language modeling for large vocabulary continuous speech recognition

RH Sun, RJ Chol - Speech Communication, 2020 - Elsevier
This paper focuses on an adaptable continuous-space language modeling approach of
combining longer context information of recurrent neural network (RNN) with adaptation …

Strong Label Generation for Preparing Speech Data in Military Applications Using CTC Loss

F Gökgöz, A Cornaggia-Urrigshardt… - 2024 International …, 2024 - ieeexplore.ieee.org
For most military speech datasets, only “weak” textual annotations are available, meaning
that the content of the speech is known but the exact on- and offset of each word is not. This …

Dithering techniques in automatic recognition of speech corrupted by MP3 compression: Analysis, solutions and experiments

M Borsky, P Mizera, P Pollak, J Nouza - Speech Communication, 2017 - Elsevier
A large portion of the audio files distributed over the Internet or those stored in personal and
corporate media archives are in a compressed form. There exist several compression …

A basis method for robust estimation of constrained MLLR

D Povey, K Yao - … Conference on Acoustics, Speech and Signal …, 2011 - ieeexplore.ieee.org
Constrained Maximum Likelihood Linear Regression (CMLLR) is a widely used speaker
adaptation technique in which an affine transform of the features is estimated for each …