Consistent and relevant: Rethink the query embedding in general sound separation

Y Wang, H Chen, D Yang, J Yu, C Weng… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
The query-based audio separation usually employs specific queries to extract target sources
from a mixture of audio signals. Currently, most query-based separation models need …

Dynamic Gated Recurrent Neural Network for Compute-efficient Speech Enhancement

L Cheng, A Pandey, B Xu, T Delbruck… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces a new Dynamic Gated Recurrent Neural Network (DG-RNN) for
compute-efficient speech enhancement models running on resource-constrained hardware …

Complexity Scaling for Speech Denoising

H Chen, J Yu, C Weng - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org
Computational complexity is critical when deploying deep learning-based speech denoising
models for on-device applications. Most prior research focused on optimizing model …

SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

Z Sun, A Li, R Chen, H Zhang, M Yu, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
The proliferation of deep neural networks has spawned the rapid development of acoustic
echo cancellation and noise suppression, and plenty of prior arts have been proposed …

Insights from Hyperparameter Scaling of Online Speech Separation

X Zhou, W Zhang, C Li, Y Qian - 2024 IEEE 14th International …, 2024 - ieeexplore.ieee.org
With the rapid development of deep learning, a large number of models with excellent
performance for speech separation tasks have emerged in the literature. Despite their …

Low Complexity Echo Delay Estimator Based on Binarized Feature Matching

Y Gao, X Su - Proc. Interspeech 2024, 2024 - isca-archive.org
Echo delay estimation (EDE) serves as a preprocessing component within an acoustic echo
canceller (AEC). Despite some progress over the past few decades, there is a dearth of …

UniAudio: Towards Universal Audio Generation with Large Language Models

D Yang, J Tian, X Tan, R Huang, S Liu, H Guo… - Forty-first International … - openreview.net
Audio generation is a major branch of generative AI research. Compared with prior works in
this area that are commonly task-specific with heavy domain knowledge, this paper …

Ultra-Low Complexity Residue Echo and Noise Suppression Based on Recurrent Neural Network

J Zhou, Y Gao, S Zhang - National Conference on Man-Machine Speech …, 2023 - Springer
Deep learning residue echo suppression (RES) exhibits superior performance compared
with traditional methods in recent years. However, a low-resource system or preemptive …