Past review, current progress, and challenges ahead on the cocktail party problem
The cocktail party problem, i.e., tracing and recognizing the speech of a specific speaker
when multiple speakers talk simultaneously, is one of the critical problems yet to be solved …
Recent developments in speech enhancement in the short-time Fourier transform domain
In this paper, we present an overview on the topic of noise reduction in the short-time Fourier
transform (STFT) domain. First, we briefly review the conventional literature in the single- and …
A consolidated perspective on multimicrophone speech enhancement and source separation
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …
Improved MVDR beamforming using single-channel mask prediction networks.
Recent studies on multi-microphone speech databases indicate that it is beneficial to
perform beamforming to improve speech recognition accuracies, especially when there is a …
Multi-channel deep clustering: Discriminative spectral and spatial embeddings for speaker-independent speech separation
The recently proposed deep clustering algorithm represents a fundamental advance
towards solving the cocktail party problem in the single-channel case. When multiple …
Far-field automatic speech recognition
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …
Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise
This paper considers acoustic beamforming for noise robust automatic speech recognition
(ASR). A beamformer attenuates background noise by enhancing sound components …
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition
Multi-source localization is an important and challenging technique for multi-talker
conversation analysis. This paper proposes a novel supervised learning method using deep …
ESPnet-SE: End-to-end speech enhancement and separation toolkit designed for ASR integration
We present ESPnet-SE, which is designed for the quick development of speech
enhancement and speech separation systems in a single framework, along with the optional …
Front-end processing for the CHiME-5 dinner party scenario
This contribution presents a speech enhancement system for the CHiME-5 Dinner Party
Scenario. The front-end employs multi-channel linear time-variant filtering and achieves its …