Recent advances in the automatic recognition of audiovisual speech

G Potamianos, C Neti, G Gravier, A Garg… - Proceedings of the …, 2003 - ieeexplore.ieee.org
Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …

Audio-visual automatic speech recognition: An overview

G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Dynamic Bayesian networks for audio-visual speech recognition

AV Nefian, L Liang, X Pi, X Liu, K Murphy - EURASIP Journal on …, 2002 - Springer
The use of visual features in audio-visual speech recognition (AVSR) is justified by both the
speech generation mechanism, which is essentially bimodal in audio and visual …

A coupled HMM for audio-visual speech recognition

AV Nefian, L Liang, X Pi, L Xiaoxiang… - … , Speech, and Signal …, 2002 - ieeexplore.ieee.org
In recent years several speech recognition systems that use visual together with audio
information showed significant increase in performance over the standard speech …

Modular intelligent transportation system

PJ Lagassey - US Patent 9,371,099, 2016 - Google Patents
4,677,845 4,678,329 4,678,792 4,678,793 4,678,814 4,679,137 4,679,147 4,680,715
4,680,787 4,680,835 4,681,576 4,866,770 4,884,091 5,045,940 5,179,383 5,218,435 …

Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition

G Papandreou, A Katsamanis… - … on Audio, Speech …, 2009 - ieeexplore.ieee.org
While the accuracy of feature measurements heavily depends on changing environmental
conditions, studying the consequences of this fact in pattern recognition tasks has received …

Gating neural network for large vocabulary audiovisual speech recognition

F Tao, C Busso - IEEE/ACM Transactions on Audio, Speech …, 2018 - ieeexplore.ieee.org
Audio-based automatic speech recognition (A-ASR) systems are affected by noisy
conditions in real-world applications. Adding visual cues to the ASR system is an appealing …

Audio-visual emotion recognition using Gaussian mixture models for face and voice

A Metallinou, S Lee… - 2008 Tenth IEEE …, 2008 - ieeexplore.ieee.org
Emotion expression associated with human communication is known to be a multimodal
process. In this work, we investigate the way that emotional information is conveyed by facial …

Robust audio-visual speech recognition under noisy audio-video conditions

D Stewart, R Seymour, A Pass… - IEEE transactions on …, 2013 - ieeexplore.ieee.org
This paper presents the maximum weighted stream posterior (MWSP) model as a robust and
efficient stream integration method for audio-visual speech recognition in environments …

Learning dynamic stream weights for coupled-HMM-based audio-visual speech recognition

AH Abdelaziz, S Zeiler… - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
With the increasing use of multimedia data in communication technologies, the idea of
employing visual information in automatic speech recognition (ASR) has recently gathered …