An overview of deep-learning-based audio-visual speech enhancement and separation

D Michelsanti, ZH Tan, SX Zhang, Y Xu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …

Deep audio-visual learning: A survey

H Zhu, MD Luo, R Wang, AH Zheng, R He - International Journal of …, 2021 - Springer
Audio-visual learning, aimed at exploiting the relationship between audio and visual
modalities, has drawn considerable attention since deep learning started to be used …

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

Y Luo, Z Chen, T Yoshioka - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Recent studies in deep learning-based speech separation have proven the superiority of
time-domain approaches to conventional time-frequency-based methods. Unlike the time …