Google Scholar

Survey of deep learning paradigms for speech processing

KB Bhangale, M Kothandaraman - Wireless Personal Communications, 2022 - Springer

Over the past decades, a particular focus is given to research on machine learning
techniques for speech processing applications. However, in the past few years, research …

Cited by 96 · All 6 versions

Deep learning for environmentally robust speech recognition: An overview of recent developments

Z Zhang, J Geiger, J Pohjalainen, AED Mousa… - ACM Transactions on …, 2018 - dl.acm.org

Eliminating the negative effect of non-stationary environmental noise is a long-standing
research topic for automatic speech recognition but still remains an important challenge …

Cited by 424 · All 11 versions

TF-GridNet: Integrating full- and sub-band modeling for speech separation

ZQ Wang, S Cornell, S Choi, Y Lee… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full- and sub-band modeling in the time-frequency (TF) domain. It stacks …

Cited by 108 · All 8 versions

Attentive statistics pooling for deep speaker embedding

K Okabe, T Koshinaka, K Shinoda - arXiv preprint arXiv:1803.10963, 2018 - arxiv.org

This paper proposes attentive statistics pooling for deep speaker embedding in
text-independent speaker verification. In conventional speaker embedding, frame-level features …

Cited by 674 · All 8 versions

SoundSpaces 2.0: A simulation platform for visual-acoustic learning

C Chen, C Schissler, S Garg… - Advances in …, 2022 - proceedings.neurips.cc

We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio
rendering for 3D environments. Given a 3D mesh of a real-world environment …

Cited by 89 · All 8 versions

FullSubNet: A full-band and sub-band fusion model for real-time single-channel speech enhancement

X Hao, X Su, R Horaud, X Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

This paper proposes a full-band and sub-band fusion model, named as FullSubNet, for
single-channel real-time speech enhancement. Full-band and sub-band refer to the models …

Cited by 243 · All 26 versions

CMGAN: Conformer-based metric GAN for speech enhancement

R Cao, S Abdulatif, B Yang - arXiv preprint arXiv:2203.15149, 2022 - arxiv.org

Recently, convolution-augmented transformer (Conformer) has achieved promising
performance in automatic speech recognition (ASR) and time-domain speech enhancement …

Cited by 117 · All 6 versions

Detection and classification of acoustic scenes and events: Outcome of the DCASE 2016 challenge

A Mesaros, T Heittola, E Benetos… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org

Public evaluation campaigns and datasets promote active development in target research
areas, allowing direct comparison of algorithms. The second edition of the challenge on …

Cited by 389 · All 8 versions

HiFi-GAN: High-fidelity denoising and dereverberation based on speech deep features in adversarial networks

J Su, Z Jin, A Finkelstein - arXiv preprint arXiv:2006.05694, 2020 - arxiv.org

Real-world audio recordings are often degraded by factors such as noise, reverberation,
and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to …

Cited by 179 · All 8 versions

M2MeT: The ICASSP 2022 multi-channel multi-party meeting transcription challenge

F Yu, S Zhang, Y Fu, L Xie, S Zheng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Recent development of speech signal processing, such as speech recognition, speaker
diarization, etc., has inspired numerous applications of speech technologies. The meeting …

Cited by 103 · All 3 versions
