Survey on automatic lip-reading in the era of deep learning
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …
A survey of research on lipreading technology
M Hao, M Mamut, N Yadikar, A Aysa, K Ubul - IEEE Access, 2020 - ieeexplore.ieee.org
Although automatic speech recognition (ASR) technology is mature, there are still some
unsolved problems, such as how to accurately identify what the speaker is saying in a noisy …
Large-scale visual speech recognition
This work presents a scalable solution to open-vocabulary visual speech recognition. To
achieve this, we constructed the largest existing visual speech recognition dataset …
Selective listening by synchronizing speech with lips
A speaker extraction algorithm seeks to extract the speech of a target speaker from a multi-
talker speech mixture when given a cue that represents the target speaker, such as a pre …
USEV: Universal speaker extraction with visual cue
A speaker extraction algorithm seeks to extract the target speaker's speech from a multi-
talker speech mixture. The prior studies focus mostly on speaker extraction from a highly …
Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition
Prior studies on audio-visual speech recognition typically assume the visibility of speaking
lips, ignoring the fact that visual occlusion occurs in real-world videos, thus adversely …
SAFARI: Speech-Associated Facial Authentication for AR/VR Settings via Robust VIbration Signatures
In AR/VR devices, the voice interface, serving as one of the primary AR/VR control
mechanisms, enables users to interact naturally using speeches (voice commands) for …
Audiovisual speech perception in noise in younger and older bilinguals.
A Chauvin, S Pellerin, AF Boatswain-Jacques… - Psychology and …, 2024 - psycnet.apa.org
Speech perception in noise becomes increasingly difficult with age. Similarly, bilinguals
often have difficulty with speech perception in noise in their second language (L2) due to …
SpotFast networks with memory augmented lateral transformers for lipreading
P Wiriyathammabhum - International Conference on Neural Information …, 2020 - Springer
This paper presents a novel deep learning architecture for word-level lipreading. Previous
works suggest a potential for incorporating a pretrained deep 3D Convolutional Neural …
CALLip: Lipreading using contrastive and attribute learning
Lipreading, aiming at interpreting speech by watching the lip movements of the speaker, has
great significance in human communication and speech understanding. Despite having …