AV-TranSpeech: Audio-visual robust speech-to-speech translation

R Huang, H Liu, X Cheng, Y Ren, L Li, Z Ye… - arXiv preprint arXiv …, 2023 - arxiv.org
Direct speech-to-speech translation (S2ST) aims to convert speech from one language into
another, and has demonstrated significant progress to date. Despite the recent success …

Audio-visual speech enhancement using self-supervised learning to improve speech intelligibility in cochlear implant simulations

RL Lai, JC Hou, M Gogate, K Dashtipour… - arXiv preprint arXiv …, 2023 - arxiv.org
Individuals with hearing impairments face challenges in their ability to comprehend speech,
particularly in noisy environments. The aim of this study is to explore the effectiveness of …

Research on DCNN-U-Net speech separation method based on Audio-Visual multimodal fusion

C Lan, R Guo, L Zhang, S Wang, M Zhang - Signal, Image and Video …, 2025 - Springer
With the rapid development of computer technology, acquiring audio-visual signals in a
complex environment is not difficult. Combining the visual information to assist speech …

[CITATION][C] Utilizing Self-Supervised Embeddings for Improving Audio-Visual Speaker Diarization at EGO4D Challenge 2023

CJ Li, WZ Ren, CW Chen, E Chu, T Huang, JC Hou…