Visual speech recognition for multiple languages in the wild

P Ma, S Petridis, M Pantic - Nature Machine Intelligence, 2022 - nature.com
Visual speech recognition (VSR) aims to recognize the content of speech based on lip
movements, without relying on the audio stream. Advances in deep learning and the …

Analysis of facial information for healthcare applications: A survey on computer vision-based approaches

M Leo, P Carcagnì, PL Mazzeo, P Spagnolo… - Information, 2020 - mdpi.com
This paper gives an overview of the cutting-edge approaches that perform facial cue
analysis in the healthcare area. The document is not limited to global face analysis but it …

Analyzing lower half facial gestures for lip reading applications: Survey on vision techniques

SJ Preethi - Computer Vision and Image Understanding, 2023 - Elsevier
Lip reading has gained popularity due to the proliferation of emerging real-world
applications. This article provides a comprehensive review of benchmark datasets available …

SDFR: Synthetic data for face recognition competition

HO Shahreza, C Ecabert, A George… - 2024 IEEE 18th …, 2024 - ieeexplore.ieee.org
Large-scale face recognition datasets are collected by crawling the Internet and without
individuals' consent, raising legal, ethical, and privacy concerns. With the recent advances in …

Lip-reading with densely connected temporal convolutional networks

P Ma, Y Wang, J Shen, S Petridis… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this work, we present the Densely Connected Temporal Convolutional Network (DC-TCN)
for lip-reading of isolated words. Although Temporal Convolutional Networks (TCN) have …

Face forgery detection by 3D decomposition and composition search

X Zhu, H Fei, B Zhang, T Zhang… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Detecting digital face manipulation has attracted extensive attention due to fake media's
potential risks to the public. However, recent advances have been able to reduce the forgery …

SIFT-CNN: when convolutional neural networks meet dense SIFT descriptors for image and sequence classification

D Tsourounis, D Kastaniotis, C Theoharatos… - Journal of …, 2022 - mdpi.com
Despite the success of hand-crafted features in computer visioning for many years,
nowadays, this has been replaced by end-to-end learnable features that are extracted from …

Lip reading of words with lip segmentation and deep learning

M Miled, MAB Messaoud, A Bouzid - Multimedia Tools and Applications, 2023 - Springer
Speech perception is recognized as a multimodal task, that is, it solicits more than one
meaning. Lip reading, which superimposes visual signals to auditory signals, is useful and …

Lip reading by alternating between spatiotemporal and spatial convolutions

D Tsourounis, D Kastaniotis, S Fotopoulos - Journal of Imaging, 2021 - mdpi.com
Lip reading (LR) is the task of predicting the speech utilizing only the visual information of
the speaker. In this work, for the first time, the benefits of alternating between spatiotemporal …

Beyond 3DMM: Learning to capture high-fidelity 3D face shape

X Zhu, C Yu, D Huang, Z Lei, H Wang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D
priori. However, previous reconstructed 3D faces suffer from degraded visual verisimilitude …