Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Light gated recurrent units for speech recognition

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research

K Kinoshita, M Delcroix, S Gannot, EAP Habets… - EURASIP Journal on …, 2016 - Springer
In recent years, substantial progress has been made in the field of reverberant speech
signal processing, including both single- and multichannel dereverberation techniques and …

Highway long short-term memory rnns for distant speech recognition

Y Zhang, G Chen, D Yu, K Yao… - … on acoustics, speech …, 2016 - ieeexplore.ieee.org
In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural
networks by introducing gated direct connections between memory cells in adjacent layers …
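
The gated direct connection between memory cells in adjacent layers can be pictured with a minimal single-time-step sketch, given below. The parameter names in p, the use of the concatenated input/hidden vector inside the depth gate, and the exact gate parameterisation are assumptions for illustration, not the paper's implementation.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_lstm_step(x, h_prev, c_prev, c_lower, p):
    """One time step of a highway LSTM layer (illustrative sketch).

    x       : input from the layer below at time t
    h_prev  : this layer's hidden state at t-1
    c_prev  : this layer's memory cell at t-1
    c_lower : the lower layer's memory cell at time t (the highway source)
    p       : dict of weight matrices and biases (hypothetical names)
    """
    z = np.concatenate([x, h_prev])
    i = sigmoid(p["Wi"] @ z + p["bi"])          # input gate
    f = sigmoid(p["Wf"] @ z + p["bf"])          # forget gate
    o = sigmoid(p["Wo"] @ z + p["bo"])          # output gate
    g = np.tanh(p["Wg"] @ z + p["bg"])          # candidate cell update
    # depth gate: how much of the lower layer's cell flows in directly
    d = sigmoid(p["Wd"] @ z + p["wd_c"] * c_lower + p["bd"])
    c = f * c_prev + i * g + d * c_lower        # gated direct (highway) connection
    h = o * np.tanh(c)
    return h, c

Stacking several such layers lets cell states and gradients flow directly across depth, which is the motivation for the gated direct connections the paper introduces.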

Report on the 11th IWSLT evaluation campaign

M Cettolo, J Niehues, S Stüker… - Proceedings of the …, 2014 - aclanthology.org
The paper overviews the 11th evaluation campaign organized by the IWSLT workshop. The
2014 evaluation offered multiple tracks on lecture transcription and translation based on the …

Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise

T Higuchi, N Ito, T Yoshioka… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
This paper considers acoustic beamforming for noise robust automatic speech recognition
(ASR). A beamformer attenuates background noise by enhancing sound components …
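
To make the mask-driven estimation concrete, the sketch below shows one common mask-based MVDR formulation: spatial covariance matrices weighted by a speech mask, followed by a Souden-style MVDR solution with a fixed reference microphone. The function name, the reference-microphone choice, and the regularisation constants are assumptions, and the paper's online/offline variants differ in detail.

import numpy as np

def mask_based_mvdr(Y, speech_mask, ref_mic=0):
    """Mask-driven MVDR beamforming (illustrative sketch).

    Y           : multichannel STFT, shape (channels, frames, freqs)
    speech_mask : time-frequency mask in [0, 1], shape (frames, freqs)
    ref_mic     : reference microphone index (assumption)
    Returns the beamformed STFT, shape (frames, freqs).
    """
    C, T, F = Y.shape
    X_hat = np.zeros((T, F), dtype=complex)
    eye = np.eye(C)
    for f in range(F):
        Yf = Y[:, :, f]                      # (C, T) observations at frequency f
        m_s = speech_mask[:, f]              # speech presence weights
        m_n = 1.0 - m_s                      # noise presence weights
        # mask-weighted spatial covariance matrices of speech and noise
        Phi_s = (m_s * Yf) @ Yf.conj().T / max(m_s.sum(), 1e-8)
        Phi_n = (m_n * Yf) @ Yf.conj().T / max(m_n.sum(), 1e-8)
        Phi_n += 1e-6 * np.trace(Phi_n).real * eye   # diagonal loading for stability
        # MVDR weights: w = (Phi_n^-1 Phi_s) e_ref / trace(Phi_n^-1 Phi_s)
        A = np.linalg.solve(Phi_n, Phi_s)
        w = A[:, ref_mic] / max(np.trace(A).real, 1e-8)
        X_hat[:, f] = w.conj() @ Yf
    return X_hat

In practice the time-frequency mask itself has to be estimated first, for example with spatial clustering or a neural network, and robustness of that estimation step is the focus of the paper.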

Speech acoustic modeling from raw multichannel waveforms

Y Hoshen, RJ Weiss, KW Wilson - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
Standard deep neural network-based acoustic models for automatic speech recognition
(ASR) rely on hand-engineered input features, typically log-mel filterbank magnitudes. In this …
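
For reference, the hand-engineered front end that raw-waveform models aim to replace can be sketched as below. The 25 ms / 10 ms framing at 16 kHz, the 40 mel bands, and the use of librosa are assumptions about typical settings rather than the paper's exact configuration.

import numpy as np
import librosa

def log_mel_features(wav_path, sr=16000, n_mels=40, n_fft=400, hop_length=160):
    """Compute log-mel filterbank magnitudes (illustrative sketch).

    Defaults assume 16 kHz audio with 25 ms windows (400 samples) and
    10 ms hops (160 samples), and 40 mel bands, which are common choices.
    """
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels, power=2.0
    )
    return np.log(mel + 1e-6)   # log compression; small floor avoids log(0)

The paper's approach instead learns the front end jointly with the acoustic model, directly from the raw multichannel waveform.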

The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices

T Yoshioka, N Ito, M Delcroix, A Ogawa… - … IEEE Workshop on …, 2015 - ieeexplore.ieee.org
CHiME-3 is a research community challenge organised in 2015 to evaluate speech
recognition systems for mobile multi-microphone devices used in noisy daily environments …

Convolutional neural networks for distant speech recognition

P Swietojanski, A Ghoshal… - IEEE Signal Processing …, 2014 - ieeexplore.ieee.org
We investigate convolutional neural networks (CNNs) for large vocabulary distant speech
recognition, trained using speech recorded from a single distant microphone (SDM) and …

Learning hidden unit contributions for unsupervised acoustic model adaptation

P Swietojanski, J Li, S Renals - IEEE/ACM Transactions on …, 2016 - ieeexplore.ieee.org
This work presents a broad study on the adaptation of neural network acoustic models by
means of learning hidden unit contributions (LHUC)—a method that linearly re-combines …
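
The LHUC re-scaling itself is compact enough to sketch directly. The function below assumes the commonly cited amplitude function 2*sigmoid(r), with one speaker-dependent parameter per hidden unit that is learned during adaptation while the speaker-independent weights stay fixed.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lhuc_forward(h, r):
    """LHUC-adapted layer output (illustrative sketch).

    h : hidden-unit activations of one layer, shape (units,) or (batch, units)
    r : speaker-dependent LHUC parameters, one scalar per hidden unit
    The amplitude function 2*sigmoid(r) constrains each unit's contribution
    to the range (0, 2).
    """
    return 2.0 * sigmoid(r) * h

In the unsupervised setting the paper studies, only r would be updated, for example by back-propagating a loss computed against first-pass decoding output for the target speaker.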