Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation

ZQ Wang, P Wang, DL Wang - IEEE/ACM transactions on …, 2021 - ieeexplore.ieee.org
We propose multi-microphone complex spectral mapping, a simple way of applying deep
learning for time-varying non-linear beamforming, for speaker separation in reverberant …

Towards unified all-neural beamforming for time and frequency domain speech separation

R Gu, SX Zhang, Y Zou, D Yu - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Recently, frequency domain all-neural beamforming methods have achieved remarkable
progress for multichannel speech separation. In parallel, the integration of time domain …

Multi-channel talker-independent speaker separation through location-based training

H Taherian, K Tan, DL Wang - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Permutation ambiguity is a crucial issue for deep learning based talker-independent
speaker separation. Deep clustering and permutation invariant training (PIT) have been …

Audio-visual end-to-end multi-channel speech separation, dereverberation and recognition

G Li, J Deng, M Geng, Z Jin, T Wang… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
Accurate recognition of cocktail party speech containing overlapping speakers, noise and
reverberation remains a highly challenging task to date. Motivated by the invariance of …

Multi-channel speech separation using spatially selective deep non-linear filters

K Tesch, T Gerkmann - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
In a multi-channel separation task with multiple speakers, we aim to recover all individual
speech signals from the mixture. In contrast to single-channel approaches, which rely on the …

End-to-end dereverberation, beamforming, and speech recognition in a cocktail party

W Zhang, X Chang, C Boeddeker… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Far-field multi-speaker automatic speech recognition (ASR) has drawn increasing attention
in recent years. Most existing methods feature a signal processing frontend and an ASR …

A novel approach to multi-channel speech enhancement based on graph neural networks

HN Chau, TD Bui, HB Nguyen… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Multi-channel speech enhancement aims at utilizing spatial relationships between signals
captured from a microphone array along with temporal-spectral information efficiently to …

Closing the gap between time-domain multi-channel speech enhancement on real and simulation conditions

W Zhang, J Shi, C Li, S Watanabe… - 2021 IEEE Workshop on …, 2021 - ieeexplore.ieee.org
The deep learning based time-domain models, e.g., Conv-TasNet, have shown great potential
in both single-channel and multi-channel speech enhancement. However, many …

Implicit neural spatial filtering for multichannel source separation in the waveform domain

D Markovic, A Defossez, A Richard - arXiv preprint arXiv:2206.15423, 2022 - arxiv.org
We present a single-stage causal waveform-to-waveform multichannel model that can
separate moving sound sources based on their broad spatial locations in a dynamic …

A time-domain real-valued generalized Wiener filter for multi-channel neural separation systems

Y Luo - IEEE/ACM Transactions on Audio, Speech, and …, 2022 - ieeexplore.ieee.org
Frequency-domain beamformers have been successful in a wide range of multi-channel
neural separation systems in the past years. However, the operations in conventional …