Insights into deep non-linear filters for improved multi-channel speech enhancement

K Tesch, T Gerkmann - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
The key advantage of using multiple microphones for speech enhancement is that spatial
filtering can be used to complement the tempo-spectral processing. In a traditional setting …

Neural spectrospatial filtering

K Tan, ZQ Wang, DL Wang - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
As the most widely-used spatial filtering approach for multi-channel speech separation,
beamforming extracts the target speech signal arriving from a specific direction. An …

Embedding and beamforming: All-neural causal beamformer for multichannel speech enhancement

A Li, W Liu, C Zheng, X Li - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Standing upon the intersection of traditional beamformers and deep neural networks, we
propose a causal neural beamformer paradigm called Embedding and Beamforming, and …

Multi-microphone complex spectral map** for utterance-wise and continuous speech separation

ZQ Wang, P Wang, DL Wang - IEEE/ACM transactions on …, 2021 - ieeexplore.ieee.org
We propose multi-microphone complex spectral map**, a simple way of applying deep
learning for time-varying non-linear beamforming, for speaker separation in reverberant …

Towards unified all-neural beamforming for time and frequency domain speech separation

R Gu, SX Zhang, Y Zou, D Yu - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Recently, frequency domain all-neural beamforming methods have achieved remarkable
progress for multichannel speech separation. In parallel, the integration of time domain …

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

J Hwang, M Hira, C Chen, X Zhang, Z Ni… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
TorchAudio is an open-source audio and speech processing library built for PyTorch. It aims
to accelerate the research and development of audio and speech technologies by providing …

[PDF][PDF] A Causal U-Net Based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement.

X Ren, X Zhang, L Chen, X Zheng, C Zhang, L Guo… - Interspeech, 2021 - academia.edu
People are meeting through video conferencing more often. While single channel speech
enhancement techniques are useful for the individual participants, the speech quality will be …

Meta-AF: Meta-learning for adaptive filters

J Casebeer, NJ Bryan… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Adaptive filtering algorithms are pervasive throughout signal processing and have had a
material impact on a wide variety of domains including audio processing …

Generalized spatio-temporal RNN beamformer for target speech separation

Y Xu, Z Zhang, M Yu, SX Zhang, D Yu - arxiv preprint arxiv:2101.01280, 2021 - arxiv.org
Although the conventional mask-based minimum variance distortionless response (MVDR)
could reduce the non-linear distortion, the residual noise level of the MVDR separated …

X-tf-gridnet: A time–frequency domain target speaker extraction network with adaptive speaker embedding fusion

F Hao, X Li, C Zheng - Information Fusion, 2024 - Elsevier
Target speaker extraction (TSE) which has the capability to directly extract desired speech
given enrollment utterances of the target speaker has attracted more and more attention for …