Multimodal audiovisual speech recognition architecture using a three‐feature multi‐fusion method for noise‐robust systems
Exposure to varied noisy environments impairs the recognition performance of artificial
intelligence‐based speech recognition technologies. Degraded‐performance services can …
Event-Triggered Fixed-Time Sliding Mode Control for Lip-Reading-Driven UAV: Disturbance Rejection Using Wind Field Optimization
This paper investigates the fixed-time sliding mode control (FTSMC) problem for a
quadcopter unmanned aerial vehicle (QUAV), which is driven by a lip-reading recognition …
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
Recent advances in Audio-Visual Speech Recognition (AVSR) have led to unprecedented
achievements in the field, improving the robustness of this type of system in adverse, noisy …
Continuous lipreading based on acoustic temporal alignments
Visual speech recognition (VSR) is a challenging task that has received increasing interest
during the last few decades. Current state of the art employs powerful end-to-end …
Comparing speaker adaptation methods for visual speech recognition for continuous Spanish
Visual speech recognition (VSR) is a challenging task that aims to interpret speech based
solely on lip movements. However, although remarkable results have recently been reached …
Evaluation of end-to-end continuous Spanish lipreading in different data conditions
Visual speech recognition remains an open research problem where different challenges
must be considered by dispensing with the auditory sense, such as visual ambiguities, the …
IR-UWB radar-based contactless silent speech recognition of vowels, consonants, words, and phrases
Several sensing techniques have been proposed for silent speech recognition (SSR);
however, many of these methods require invasive processes or sensor attachment to the …
Arabic Lip Reading with Limited Data Using Deep Learning
Z Jabr, S Etemadi, N Mozayani - IEEE Access, 2024 - ieeexplore.ieee.org
Two main challenges faced by deep learning systems are related to the amount of data and
the complexity of the model concerning the number and type of layers and the number of …
Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish
Different studies have shown the importance of visual cues throughout the speech
perception process. In fact, the development of audiovisual approaches has led to advances …
Extending LIP-RTVE: Towards A Large-Scale Audio-Visual Dataset for Continuous Spanish in the Wild
M Zaragozá-Portolés, D Gimeno-Gómez… - Proc. IberSPEECH …, 2024 - isca-archive.org
This article presents the extension of the LIP-RTVE dataset, a dataset dedicated to the
Spanish language for advancing audiovisual speech technologies. The annotated corpus …