Neural target speech extraction: An overview

K Zmolikova, M Delcroix, T Ochiai… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Humans can listen to a target speaker even in challenging acoustic conditions that have
noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail …

A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - ar…

Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR

ZQ Wang, P Wang, DL Wang - IEEE/ACM transactions on …, 2020 - ieeexplore.ieee.org
This study proposes a complex spectral mapping approach for single- and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …

Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Speakerbeam: Speaker aware neural network for target speaker extraction in speech mixtures

K Žmolíková, M Delcroix, K Kinoshita… - IEEE Journal of …, 2019 - ieeexplore.ieee.org
The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …

Multi-channel deep clustering: Discriminative spectral and spatial embeddings for speaker-independent speech separation

ZQ Wang, J Le Roux, JR Hershey - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
The recently-proposed deep clustering algorithm represents a fundamental advance
towards solving the cocktail party problem in the single-channel case. When multiple …

End-to-end microphone permutation and number invariant multi-channel speech separation

Y Luo, Z Chen, N Mesgarani… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
An important problem in ad-hoc microphone speech separation is how to guarantee the
robustness of a system with respect to the locations and numbers of microphones. The …