Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org
Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

MMTM: Multimodal transfer module for CNN fusion

HRV Joze, A Shaban, ML Iuzzolino… - Proceedings of the …, 2020 - openaccess.thecvf.com
In late fusion, each modality is processed in a separate unimodal Convolutional Neural
Network (CNN) stream and the scores of each modality are fused at the end. Due to its …

Light gated recurrent units for speech recognition

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

The fifth 'CHiME' speech separation and recognition challenge: dataset, task and baselines

J Barker, S Watanabe, E Vincent, J Trmal - arXiv preprint arXiv …, 2018 - arxiv.org
The CHiME challenge series aims to advance robust automatic speech recognition (ASR)
technology by promoting research at the interface of speech and language processing …

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - arXiv preprint arXiv …, 2023 - arxiv.org
The CHiME challenges have played a significant role in the development and evaluation of
robust automatic speech recognition (ASR) systems. We introduce the CHiME-7 distant ASR …

Detection and classification of acoustic scenes and events: Outcome of the DCASE 2016 challenge

A Mesaros, T Heittola, E Benetos… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
Public evaluation campaigns and datasets promote active development in target research
areas, allowing direct comparison of algorithms. The second edition of the challenge on …

A consolidated perspective on multimicrophone speech enhancement and source separation

S Gannot, E Vincent… - … /ACM Transactions on …, 2017 - ieeexplore.ieee.org
Speech enhancement and separation are core problems in audio signal processing, with
commercial applications in devices as diverse as mobile phones, conference call systems …

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …

The 2018 signal separation evaluation campaign

FR Stöter, A Liutkus, N Ito - … Variable Analysis and Signal Separation: 14th …, 2018 - Springer
This paper reports the organization and results for the 2018 community-based Signal
Separation Evaluation Campaign (SiSEC 2018). This year's edition was focused on audio …

Acoustic scene classification: Classifying environments from the sounds they produce

D Barchiesi, D Giannoulis, D Stowell… - IEEE Signal …, 2015 - ieeexplore.ieee.org
In this article, we present an account of the state of the art in acoustic scene classification
(ASC), the task of classifying environments from the sounds they produce. Starting from a …