An overview of deep-learning-based audio-visual speech enhancement and separation
Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …
Analyzing lower half facial gestures for lip reading applications: Survey on vision techniques
SJ Preethi - Computer Vision and Image Understanding, 2023 - Elsevier
Lip reading has gained popularity due to the proliferation of emerging real-world
applications. This article provides a comprehensive review of benchmark datasets available …
Revise: Self-supervised speech resynthesis with visual input for universal and generalized speech regeneration
Prior works on improving speech quality with visual input typically study each type of
auditory distortion separately (e.g., separation, inpainting, video-to-speech) and present …
Interspeech 2022 audio deep packet loss concealment challenge
Audio Packet Loss Concealment (PLC) is the hiding of gaps in audio streams caused by
data transmission failures in packet switched networks. This is a common problem, and of …
Can audio-visual integration strengthen robustness under multimodal attacks?
In this paper, we propose to make a systematic study on machines' multisensory perception
under attacks. We use the audio-visual event recognition task against multimodal …
SpeechPainter: Text-conditioned speech inpainting
We propose SpeechPainter, a model for filling in gaps of up to one second in speech
samples by leveraging an auxiliary textual input. We demonstrate that the model performs …
Deep prior-based audio inpainting using multi-resolution harmonic convolutional neural networks
In this manuscript, we propose a novel method to perform audio inpainting, i.e., the
restoration of audio signals presenting multiple missing parts. Audio inpainting can be …
Diffusion-based audio inpainting
Audio inpainting aims to reconstruct missing segments in corrupted recordings. Most of
existing methods produce plausible reconstructions when the gap lengths are short, but …
Audio-visual speech synthesis using vision transformer–enhanced autoencoders with ensemble of loss functions
Audio-visual speech synthesis (AVSS) has garnered attention in recent years for its utility in
the realm of audio-visual learning. AVSS transforms one speaker's speech into another's …
Revise: Self-supervised speech resynthesis with visual input for universal and generalized speech enhancement
Prior works on improving speech quality with visual input typically study each type of
auditory distortion separately (e.g., separation, inpainting, video-to-speech) and present …