Survey on automatic lip-reading in the era of deep learning
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …
Multimodal deep learning
Deep networks have been successfully applied to unsupervised feature learning for single
modalities (e.g., text, images, or audio). In this work, we propose a novel application of deep …
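For illustration of the shared-representation idea summarized in the entry above, here is a minimal PyTorch sketch of a bimodal autoencoder that encodes two modalities into a joint code and reconstructs both; all layer sizes, feature dimensions, and names are assumptions for the sketch, not details from the cited work.

```python
# Illustrative sketch only: a tiny bimodal autoencoder learning a shared
# representation from audio and video feature vectors (dimensions assumed).
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    def __init__(self, audio_dim=100, video_dim=256, shared_dim=64):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 128), nn.ReLU())
        self.video_enc = nn.Sequential(nn.Linear(video_dim, 128), nn.ReLU())
        self.shared = nn.Sequential(nn.Linear(256, shared_dim), nn.ReLU())
        self.audio_dec = nn.Linear(shared_dim, audio_dim)
        self.video_dec = nn.Linear(shared_dim, video_dim)

    def forward(self, audio, video):
        # Concatenate per-modality codes, map to a shared representation,
        # then reconstruct each modality from that shared code.
        h = self.shared(torch.cat([self.audio_enc(audio), self.video_enc(video)], dim=-1))
        return self.audio_dec(h), self.video_dec(h)

model = BimodalAutoencoder()
audio = torch.randn(8, 100)   # batch of audio feature vectors
video = torch.randn(8, 256)   # batch of mouth-region visual features
audio_rec, video_rec = model(audio, video)
loss = nn.functional.mse_loss(audio_rec, audio) + nn.functional.mse_loss(video_rec, video)
loss.backward()
```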
A review of recent advances in visual speech decoding
Visual speech information plays an important role in automatic speech recognition (ASR),
especially when audio is corrupted or even inaccessible. Despite the success of audio …
Multimodal learning with deep boltzmann machines
We propose a Deep Boltzmann Machine for learning a generative model of
multimodal data. We show how to use the model to extract a meaningful representation of …
Deep multimodal learning for audio-visual speech recognition
In this paper, we present methods in deep multimodal learning for fusing speech and visual
modalities for Audio-Visual Automatic Speech Recognition (AV-ASR). First, we study an …
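As a toy illustration of fusing the two modalities for AV-ASR (not the method of the paper above), the sketch below concatenates per-frame audio and visual features before a recurrent word classifier; every dimension, the vocabulary size, and the class names are assumptions.

```python
# Illustrative sketch only: feature-level (early) fusion of audio and visual
# streams for a toy AV-ASR word classifier (all dimensions/classes assumed).
import torch
import torch.nn as nn

class AVFusionClassifier(nn.Module):
    def __init__(self, audio_dim=40, video_dim=512, hidden=128, num_words=500):
        super().__init__()
        self.rnn = nn.LSTM(audio_dim + video_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_words)

    def forward(self, audio_seq, video_seq):
        # audio_seq: (batch, time, audio_dim), video_seq: (batch, time, video_dim)
        fused = torch.cat([audio_seq, video_seq], dim=-1)   # early (feature) fusion
        _, (h_n, _) = self.rnn(fused)
        return self.head(h_n[-1])                           # word logits per utterance

model = AVFusionClassifier()
logits = model(torch.randn(4, 75, 40), torch.randn(4, 75, 512))
print(logits.shape)  # torch.Size([4, 500])
```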
OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis
Visual speech constitutes a large part of our nonrigid facial motion and contains important
information that allows machines to interact with human users, for instance, through …
Multi-grained spatio-temporal features perceived network for event-based lip-reading
Automatic lip-reading (ALR) aims to recognize words using visual information from the
speaker's lip movements. In this work, we introduce a novel type of sensing device, event …
Deep learning-based automated lip-reading: A survey
A survey on automated lip-reading approaches is presented in this paper with the main
focus on deep learning-related methodologies, which have proven more fruitful …
Lip reading sentences using deep learning with only visual cues
In this paper, a neural network-based lip reading system is proposed. The system is lexicon-
free and uses purely visual cues. With only a limited number of visemes as classes to …
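To illustrate the visual-only, viseme-class setup described in the entry above, here is a minimal sketch of a 3D-convolutional front-end followed by a recurrent layer that emits per-frame viseme logits; the viseme count, crop size, and all layer sizes are assumptions rather than details from the paper.

```python
# Illustrative sketch only: map a mouth-region video clip to per-frame viseme
# logits using a 3D conv front-end and a bidirectional GRU (sizes assumed).
import torch
import torch.nn as nn

class VisemeNet(nn.Module):
    def __init__(self, num_visemes=14):
        super().__init__()
        # 3D convolution over (channels, time, height, width) grayscale mouth crops
        self.frontend = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # halve spatial dims, keep time
        )
        self.gru = nn.GRU(32 * 32 * 32, 128, batch_first=True, bidirectional=True)
        self.head = nn.Linear(256, num_visemes)

    def forward(self, clips):
        # clips: (batch, 1, time, 64, 64)
        f = self.frontend(clips)                    # (batch, 32, time, 32, 32)
        f = f.permute(0, 2, 1, 3, 4).flatten(2)     # (batch, time, 32*32*32)
        out, _ = self.gru(f)
        return self.head(out)                       # (batch, time, num_visemes)

model = VisemeNet()
logits = model(torch.randn(2, 1, 30, 64, 64))
print(logits.shape)  # torch.Size([2, 30, 14])
```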
End-to-end neuromorphic lip-reading
Human speech perception is intrinsically a multi-modal task since speech production
requires the speaker to move the lips, producing visual cues in addition to auditory …