Survey on automatic lip-reading in the era of deep learning

A Fernandez-Lopez, FM Sukno - Image and Vision Computing, 2018 - Elsevier
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …

Multimodal deep learning

J Ngiam, A Khosla, M Kim, J Nam, H Lee, AY Ng - ICML, 2011 - academia.edu
Deep networks have been successfully applied to unsupervised feature learning for single
modalities (eg, text, images or audio). In this work, we propose a novel application of deep …
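The snippet above describes learning features over multiple modalities with deep networks. As a rough illustration only (not the authors' architecture), the following sketch shows feature-level fusion of pre-extracted audio and video vectors through a small bimodal autoencoder; all dimensions and layer sizes are made-up placeholders.

```python
# Hypothetical sketch: a bimodal autoencoder learning a shared representation
# from concatenated audio and video feature vectors. Sizes are illustrative.
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    def __init__(self, audio_dim=100, video_dim=50, shared_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 128), nn.ReLU(),
            nn.Linear(128, shared_dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(shared_dim, 128), nn.ReLU(),
            nn.Linear(128, audio_dim + video_dim),
        )

    def forward(self, audio, video):
        joint = torch.cat([audio, video], dim=-1)  # early (feature-level) fusion
        shared = self.encoder(joint)               # shared multimodal representation
        recon = self.decoder(shared)               # reconstruct both modalities
        return shared, recon

model = BimodalAutoencoder()
audio = torch.randn(8, 100)  # batch of audio features (e.g. spectrogram frames)
video = torch.randn(8, 50)   # batch of visual features (e.g. mouth-region encodings)
shared, recon = model(audio, video)
loss = nn.functional.mse_loss(recon, torch.cat([audio, video], dim=-1))
```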

A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and Vision Computing, 2014 - Elsevier
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Multimodal learning with deep Boltzmann machines

N Srivastava, RR Salakhutdinov - Advances in neural …, 2012 - proceedings.neurips.cc
Abstract We propose a Deep Boltzmann Machine for learning a generative model of
multimodal data. We show how to use the model to extract a meaningful representation of …

Deep multimodal learning for audio-visual speech recognition

Y Mroueh, E Marcheret, V Goel - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
In this paper, we present methods in deep multimodal learning for fusing speech and visual
modalities for Audio-Visual Automatic Speech Recognition (AV-ASR). First, we study an …
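The entry above concerns fusing acoustic and visual streams for AV-ASR. Purely as a hedged sketch of one common fusion strategy (not necessarily the one studied in the paper), the snippet below concatenates per-frame audio and visual features and classifies them jointly; the class count and feature dimensions are invented for illustration.

```python
# Hypothetical feature-level fusion classifier for audio-visual speech recognition.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, audio_dim=40, video_dim=30, num_classes=44):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + video_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, audio_feats, video_feats):
        fused = torch.cat([audio_feats, video_feats], dim=-1)  # concatenate modalities
        return self.net(fused)                                 # per-frame class logits

clf = FusionClassifier()
logits = clf(torch.randn(8, 40), torch.randn(8, 30))
print(logits.shape)  # torch.Size([8, 44])
```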

OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis

I Anina, Z Zhou, G Zhao… - 2015 11th IEEE …, 2015 - ieeexplore.ieee.org
Visual speech constitutes a large part of our nonrigid facial motion and contains important
information that allows machines to interact with human users, for instance, through …

Multi-grained spatio-temporal features perceived network for event-based lip-reading

G Tan, Y Wang, H Han, Y Cao… - Proceedings of the …, 2022 - openaccess.thecvf.com
Automatic lip-reading (ALR) aims to recognize words using visual information from the
speaker's lip movements. In this work, we introduce a novel type of sensing device, event …

Deep learning-based automated lip-reading: A survey

S Fenghour, D Chen, K Guo, B Li, P Xiao - IEEE Access, 2021 - ieeexplore.ieee.org
A survey on automated lip-reading approaches is presented in this paper with the main
focus being on deep learning related methodologies which have proven to be more fruitful …

Lip reading sentences using deep learning with only visual cues

S Fenghour, D Chen, K Guo, P Xiao - IEEE Access, 2020 - ieeexplore.ieee.org
In this paper, a neural network-based lip reading system is proposed. The system is lexicon-
free and uses purely visual cues. With only a limited number of visemes as classes to …
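The snippet above mentions using a limited number of visemes as classes. The toy example below only illustrates why that class set can be small: several phonemes map to the same viseme, so distinct words can share a viseme sequence and must be disambiguated from context. The grouping shown is a simplified assumption, not the mapping used in the paper.

```python
# Toy illustration of the many-to-one phoneme-to-viseme mapping (simplified, hypothetical).
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "t": "V_alveolar", "d": "V_alveolar", "s": "V_alveolar", "z": "V_alveolar",
    "k": "V_velar", "g": "V_velar",
    "aa": "V_open", "ae": "V_open",
    "iy": "V_spread", "ih": "V_spread",
}

def phonemes_to_visemes(phonemes):
    """Collapse a phoneme sequence to its viseme sequence."""
    return [PHONEME_TO_VISEME.get(p, "V_other") for p in phonemes]

# "bat" and "mat" collapse to the same viseme sequence, so a visual-only decoder
# has to resolve the ambiguity at the word or sentence level.
print(phonemes_to_visemes(["b", "ae", "t"]))
print(phonemes_to_visemes(["m", "ae", "t"]))
```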

End-to-end neuromorphic lip-reading

H Bulzomi, M Schweiker, A Gruel… - Proceedings of the …, 2023 - openaccess.thecvf.com
Human speech perception is intrinsically a multi-modal task since speech production
requires the speaker to move the lips, producing visual cues in addition to auditory …