On the issues of intra-speaker variability and realism in speech, speaker, and language recognition tasks

JHL Hansen, H Bořil - Speech Communication, 2018 - Elsevier
Recent years have witnessed notable advancements in the areas of speech, speaker and
language/dialect recognition. However, many of the emerging scientific principles appear to …

The automatic speech recognition in reverberant environments (ASpIRE) challenge

M Harper - 2015 IEEE Workshop on Automatic Speech …, 2015 - ieeexplore.ieee.org
In this paper, we describe the ASpIRE (Automatic Speech recognition In Reverberant
Environments) challenge, which asked participants to construct automatic speech …

Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning

R Giri, ML Seltzer, J Droppo… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
In this paper, we propose two approaches to improve deep neural network (DNN) acoustic
models for speech recognition in reverberant environments. Both methods utilize auxiliary …

Adaptation of deep neural network acoustic models for robust automatic speech recognition

KC Sim, Y Qian, G Mantena, L Samarakoon… - New Era for Robust …, 2017 - Springer
Deep neural networks (DNNs) have been successfully applied to many pattern classification
problems, including acoustic modelling for automatic speech recognition (ASR). However …

GCC-PHAT with speech-oriented attention for robotic sound source localization

J Wang, X Qian, Z Pan, M Zhang… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Robotic audition is a basic sense that helps robots perceive the surroundings and interact
with humans. Sound Source Localization (SSL) is an essential module for a robotic system …

Blind spectral weighting for robust speaker identification under reverberation mismatch

SO Sadjadi, JHL Hansen - IEEE/ACM transactions on audio …, 2014 - ieeexplore.ieee.org
Room reverberation poses various deleterious effects on performance of automatic speech
systems. Speaker identification (SID) performance, in particular, degrades rapidly as …

A fundamental pitfall in blind deconvolution with sparse and shift-invariant priors

A Benichoux, E Vincent… - 2013 IEEE International …, 2013 - ieeexplore.ieee.org
We consider the problem of blind sparse deconvolution, which is common in both image and
signal processing. To counter-balance the ill-posedness of the problem, many approaches …

Joint acoustic and spectral modeling for speech dereverberation using non-negative representations

N Mohammadiha, P Smaragdis… - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
This paper proposes a single-channel speech dereverberation method enhancing the
spectrum of the reverberant speech signal. The proposed method uses a non-negative …

Anti-forensics of environmental-signature-based audio splicing detection and its countermeasure via rich-features classification

H Zhao, Y Chen, R Wang… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Numerous methods for detecting audio splicing have been proposed. Environmental-
signature-based methods are considered to be the most effective forgery detection methods …

A generalized nonnegative tensor factorization approach for distant speech recognition with distributed microphones

S Mirsamadi, JHL Hansen - IEEE/ACM Transactions on Audio …, 2016 - ieeexplore.ieee.org
Automatic speech recognition (ASR) using distant (far-field) microphones is a challenging
task, in which room reverberation is one of the primary causes of performance degradation …