An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …

A state-of-the-art survey on noise removal in a non-stationary signal using adaptive finite impulse response filtering: challenges, techniques, and applications

NK Yadav, A Dhawan, M Tiwari… - International Journal of …, 2025 - Taylor & Francis
An adaptive finite impulse response (FIR) filter is a key technique to remove noise in non-
stationary signals. With the rapid development of the various adaptive algorithms, it is both …

Uncertain LDA: Including observation uncertainties in discriminative transforms

R Saeidi, RF Astudillo, D Kolossa - IEEE transactions on pattern …, 2015 - ieeexplore.ieee.org
Linear discriminant analysis (LDA) is a powerful technique in pattern recognition to reduce
the dimensionality of data vectors. It maximizes discriminability by retaining only those …

Group sparsity for speaker identity discrimination in factorisation-based speech recognition

A Hurmalainen, R Saeidi, T Virtanen - 2012 - repository.ubn.ru.nl
Spectrogram factorisation using a dictionary of spectrotemporal atoms has been
successfully employed to separate a mixed audio signal into its source components. When …

Detection, separation and recognition of speech from continuous signals using spectral factorisation

A Hurmalainen, JF Gemmeke… - 2012 Proceedings of the …, 2012 - ieeexplore.ieee.org
In real world speech processing, the signals are often continuous and consist of momentary
segments of speech over non-stationary background noise. It has been demonstrated that …

Multichannel audio separation by direction of arrival based spatial covariance model and non-negative matrix factorization

J Nikunen, T Virtanen - 2014 IEEE International Conference on …, 2014 - ieeexplore.ieee.org
This paper studies multichannel audio separation using non-negative matrix factorization
(NMF) combined with a new model for spatial covariance matrices (SCM). The proposed …

Variational Bayesian inference for source separation and robust feature extraction

K Adiloğlu, E Vincent - IEEE/ACM Transactions on Audio …, 2016 - ieeexplore.ieee.org
We consider the task of separating and classifying individual sound sources mixed together.
The main challenge is to achieve robust classification despite residual distortion of the …

Noise robust speaker recognition with convolutive sparse coding

A Hurmalainen, R Saeidi, T Virtanen - Interspeech, 2015 - isca-archive.org
Recognition and classification of speech content in everyday environments is challenging
due to the large diversity of real-world noise sources, which may also include competing …

The TUM+TUT+KUL approach to the 2nd CHiME challenge: Multi-stream ASR exploiting BLSTM networks and sparse NMF

JT Geiger, F Weninger, A Hurmalainen… - Proc. 2nd CHiME …, 2013 - mediatum.ub.tum.de
We present our joint contribution to the 2nd CHiME Speech Separation and Recognition
Challenge. Our system combines speech enhancement by supervised sparse non-negative …

Noise robust exemplar matching using sparse representations of speech

E Yılmaz, JF Gemmeke - IEEE/ACM transactions on audio …, 2014 - ieeexplore.ieee.org
Performing automatic speech recognition using exemplars (templates) holds the promise to
provide a better duration and coarticulation modeling compared to conventional approaches …