Recent advances in the automatic recognition of audiovisual speech

G Potamianos, C Neti, G Gravier, A Garg… - Proceedings of the …, 2003 - ieeexplore.ieee.org
Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …

An overview of speaker identification: Accuracy and robustness issues

R Togneri, D Pullella - IEEE circuits and systems magazine, 2011 - ieeexplore.ieee.org
This paper presents the main paradigms for speaker identification, and recent work on
missing data methods to increase robustness. The feature extraction, speaker modeling and …

Benchmarking neural network robustness to common corruptions and perturbations

D Hendrycks, T Dietterich - arXiv preprint arXiv:1903.12261, 2019 - arxiv.org
In this paper we establish rigorous benchmarks for image classifier robustness. Our first
benchmark, ImageNet-C, standardizes and expands the corruption robustness topic, while …

Benchmarking neural network robustness to common corruptions and surface variations

D Hendrycks, TG Dietterich - arXiv preprint arXiv:1807.01697, 2018 - arxiv.org
In this paper we establish rigorous benchmarks for image classifier robustness. Our first
benchmark, ImageNet-C, standardizes and expands the corruption robustness topic, while …

Audio-visual automatic speech recognition: An overview

G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Transfer linear subspace learning for cross-corpus speech emotion recognition

P Song - IEEE transactions on affective computing, 2017 - ieeexplore.ieee.org
Speech emotion recognition has received an increasing interest in recent years, which is
often conducted on the assumption that speech utterances in training and testing datasets …

Kaleidoscope: An efficient, learnable representation for all structured linear maps

T Dao, NS Sohoni, A Gu, M Eichhorn, A Blonder… - arXiv preprint arXiv …, 2020 - arxiv.org
Modern neural network architectures use structured linear transformations, such as low-rank
matrices, sparse matrices, permutations, and the Fourier transform, to improve inference …

Audio visual speech recognition

C Neti, G Potamianos, J Luettin, I Matthews, H Glotin… - 2000 - infoscience.epfl.ch
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

1993 benchmark tests for the ARPA spoken language program

DS Pallett, JG Fiscus, WM Fisher… - … : Proceedings of a …, 1994 - aclanthology.org
This paper reports results obtained in benchmark tests conducted within the ARPA Spoken
Language program in November and December of 1993. In addition to ARPA contractors …

Cardiac anomaly detection considering an additive noise and convolutional distortion model of heart sound recordings

FB Azam, MI Ansari, SISK Nuhash, I McLane… - Artificial Intelligence in …, 2022 - Elsevier
Cardiac auscultation is an essential point-of-care method used for the early diagnosis of
heart diseases. Automatic analysis of heart sounds for abnormality detection is faced with …