Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language

JH Yeo, CW Kim, H Kim, H Rha, S Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Lip reading aims to predict spoken language by analyzing lip movements. Despite
advancements in lip reading technologies, performance degrades when models are applied …

mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition

A Rouditchenko, S Bhati, S Thomas, H Kuehne… - arXiv preprint arXiv …, 2025 - arxiv.org
Audio-Visual Speech Recognition (AVSR) combines lip-based video with audio and can
improve performance in noise, but most methods are trained only on English data. One …