Sparse subspace modeling for query by example spoken term detection

D Ram, A Asaei, H Bourlard - IEEE/ACM Transactions on Audio …, 2018 - ieeexplore.ieee.org
This paper focuses on the problem of query by example spoken term detection (QbE-STD) in
zero-resource scenario. Current state-of-the-art approaches to tackle this problem rely on …

Evaluation of phone posterior probabilities for pathology detection in speech data using deep learning models

S Farazi, Y Shekofteh - International Journal of Speech Technology, 2025 - Springer
Voice pathology detection (VPD) aims to accurately identify voice impairments by analyzing
speech signals. This study proposes models based on deep learning (DL) for binary …

On quantifying the quality of acoustic models in hybrid DNN-HMM ASR

P Dighe, A Asaei, H Bourlard - Speech Communication, 2020 - Elsevier
We propose an information theoretic framework for quantitative assessment of acoustic
models used in hidden Markov model (HMM) based automatic speech recognition (ASR) …

Low-rank and sparse soft targets to learn better DNN acoustic models

P Dighe, A Asaei, H Bourlard - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian
mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the …

Low-rank and sparse subspace modeling of speech for DNN based acoustic modeling

P Dighe, A Asaei, H Bourlard - Speech Communication, 2019 - Elsevier
Towards the goal of improving acoustic modeling for automatic speech recognition (ASR),
this work investigates the modeling of senone subspaces in deep neural network (DNN) …

Phonological Posterior Hashing for Query by Example Spoken Term Detection

A Asaei, D Ram, H Bourlard - INTERSPEECH, 2018 - publications.idiap.ch
State of the art query by example spoken term detection (QbE-STD) systems in zero-
resource conditions rely on representation of speech in terms of sequences of class …

Exploring Low-Dimensional Structures of Modulation Spectra for Robust Speech Recognition

BC Yan, CH Shih, SH Liu, B Chen - INTERSPEECH, 2017 - isca-archive.org
Developments of noise robustness techniques are vital to the success of automatic speech
recognition (ASR) systems in face of varying sources of environmental interference. Recent …

Exploiting eigenposteriors for semi-supervised training of DNN acoustic models with sequence discrimination

P Dighe, A Asaei, H Bourlard - Proceedings of Interspeech, 2017 - infoscience.epfl.ch
Deep neural network (DNN) acoustic models yield posterior probabilities of senone classes.
Recent studies support the existence of low-dimensional subspaces underlying senone …

Phonetic and phonological posterior search space hashing exploiting class-specific sparsity structures

A Asaei, G Luyet, M Cernak, H Bourlard - 2016 - infoscience.epfl.ch
This paper shows that exemplar-based speech processing using class-conditional posterior
probabilities admits a highly effective search strategy relying on posteriors' intrinsic sparsity …