A review of speaker diarization: Recent advances with deep learning
Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …
Supervised speech separation based on deep learning: An overview
Speech separation is the task of separating target speech from background interference.
Traditionally, speech separation is studied as a signal processing problem. A more recent …
SpeechBrain: A general-purpose speech toolkit
M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - ar…
Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR
This study proposes a complex spectral mapping approach for single- and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …
Multi-channel deep clustering: Discriminative spectral and spatial embeddings for speaker-independent speech separation
The recently-proposed deep clustering algorithm represents a fundamental advance
towards solving the cocktail party problem in the single-channel case. When multiple …
Speakerbeam: Speaker aware neural network for target speaker extraction in speech mixtures
The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …
End-to-end microphone permutation and number invariant multi-channel speech separation
An important problem in ad-hoc microphone speech separation is how to guarantee the
robustness of a system with respect to the locations and numbers of microphones. The …