- Academic Search

Supervised speech separation based on deep learning: An overview

DL Wang, J Chen - IEEE/ACM transactions on audio, speech …, 2018 - ieeexplore.ieee.org

Speech separation is the task of separating target speech from background interference.
Traditionally, speech separation is studied as a signal processing problem. A more recent …

Cited by 1650

Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org

Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Cited by 428

Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR

ZQ Wang, P Wang, DL Wang - IEEE/ACM transactions on …, 2020 - ieeexplore.ieee.org

This study proposes a complex spectral mapping approach for single- and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …

Cited by 221

Improved MVDR beamforming using single-channel mask prediction networks

H Erdogan, JR Hershey, S Watanabe, MI Mandel… - Interspeech, 2016 - isca-archive.org

Recent studies on multi-microphone speech databases indicate that it is beneficial to
perform beamforming to improve speech recognition accuracies, especially when there is a …

Cited by 384

Multi-channel deep clustering: Discriminative spectral and spatial embeddings for speaker-independent speech separation

ZQ Wang, J Le Roux, JR Hershey - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

The recently-proposed deep clustering algorithm represents a fundamental advance
towards solving the cocktail party problem in the single-channel case. When multiple …

Cited by 282

Far-field automatic speech recognition

R Haeb-Umbach, J Heymann, L Drude… - Proceedings of the …, 2020 - ieeexplore.ieee.org

The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …

Cited by 121

Neural spectrospatial filtering

K Tan, ZQ Wang, DL Wang - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org

As the most widely-used spatial filtering approach for multi-channel speech separation,
beamforming extracts the target speech signal arriving from a specific direction. An …

Cited by 65

The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices

T Yoshioka, N Ito, M Delcroix, A Ogawa… - … IEEE Workshop on …, 2015 - ieeexplore.ieee.org

CHiME-3 is a research community challenge organised in 2015 to evaluate speech
recognition systems for mobile multi-microphone devices used in noisy daily environments …

Cited by 271

The first multimodal information based speech processing (MISP) challenge: Data, tasks, baselines and results

H Chen, H Zhou, J Du, CH Lee, J Chen… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

In this paper we discuss the rationale of the Multimodal Information based Speech
Processing (MISP) Challenge, and provide a detailed description of the data recorded, the …

Cited by 54

A four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection

Q Wang, J Du, HX Wu, J Pan, F Ma… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

In this paper, we propose a novel four-stage data augmentation approach to ResNet-
Conformer based acoustic modeling for sound event localization and detection (SELD) …

Cited by 109

Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise

Supervised speech separation based on deep learning: An overview