Consistent and relevant: Rethink the query embedding in general sound separation
The query-based audio separation usually employs specific queries to extract target sources
from a mixture of audio signals. Currently, most query-based separation models need …
Dynamic Gated Recurrent Neural Network for Compute-efficient Speech Enhancement
This paper introduces a new Dynamic Gated Recurrent Neural Network (DG-RNN) for
compute-efficient speech enhancement models running on resource-constrained hardware …
Complexity Scaling for Speech Denoising
Computational complexity is critical when deploying deep learning-based speech denoising
models for on-device applications. Most prior research focused on optimizing model …
SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression
The proliferation of deep neural networks has spawned the rapid development of acoustic
echo cancellation and noise suppression, and plenty of prior arts have been proposed …
Insights from Hyperparameter Scaling of Online Speech Separation
With the rapid development of deep learning, a large number of models with excellent
performance for speech separation tasks have emerged in the literature. Despite their …
Low Complexity Echo Delay Estimator Based on Binarized Feature Matching
Y Gao, X Su - Proc. Interspeech 2024, 2024 - isca-archive.org
Echo delay estimation (EDE) serves as a preprocessing component within an acoustic echo
canceller (AEC). Despite some progress over the past few decades, there is a dearth of …
UniAudio: Towards Universal Audio Generation with Large Language Models
Audio generation is a major branch of generative AI research. Compared with prior works in
this area that are commonly task-specific with heavy domain knowledge, this paper …
Ultra-Low Complexity Residue Echo and Noise Suppression Based on Recurrent Neural Network
J Zhou, Y Gao, S Zhang - National Conference on Man-Machine Speech …, 2023 - Springer
Deep learning residue echo suppression (RES) exhibits superior performance compared
with traditional methods in recent years. However, a low-resource system or preemptive …