Survey on automatic lip-reading in the era of deep learning

A Fernandez-Lopez, FM Sukno - Image and Vision Computing, 2018 - Elsevier
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …

A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and Vision Computing, 2014 - Elsevier
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Audio-visual speech recognition using deep learning

K Noda, Y Yamaguchi, K Nakadai, HG Okuno… - Applied Intelligence, 2015 - Springer
Audio-visual speech recognition (AVSR) system is thought to be one of the most promising
solutions for reliable speech recognition, particularly when the audio is corrupted by noise …

Deep learning of mouth shapes for sign language

O Koller, H Ney, R Bowden - Proceedings of the IEEE …, 2015 - openaccess.thecvf.com
This paper deals with robust modelling of mouth shapes in the context of sign language
recognition using deep convolutional neural networks. Sign language mouth shapes are …

Lipreading using convolutional neural network.

K Noda, Y Yamaguchi, K Nakadai, HG Okuno… - Interspeech, 2014 - isca-archive.org
In recent automatic speech recognition studies, deep learning architecture applications for
acoustic modeling have eclipsed conventional sound features such as Mel-frequency …

Deep learning-based automated lip-reading: A survey

S Fenghour, D Chen, K Guo, B Li, P …

Phoneme-to-viseme mappings: the good, the bad, and the ugly

HL Bear, R Harvey - Speech Communication, 2017 - Elsevier
Visemes are the visual equivalent of phonemes. Although not precisely defined, a common
working definition of a viseme is “a set of phonemes which have identical appearance on the …

Multimodal corpus design for audio-visual speech recognition in vehicle cabin

A Kashevnik, I Lashkov, A Axyonov, D Ivanko… - IEEE …, 2021 - ieeexplore.ieee.org
This paper introduces a new methodology aimed at comfort for the driver in-the-wild
multimodal corpus creation for audio-visual speech recognition in driver monitoring systems …