- Academic Search

Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by: 440

SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arXiv preprint arXiv …, 2021 - arxiv.org

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

Cited by: 747

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by: 176

Torchaudio: Building blocks for audio and speech processing

YY Yang, M Hira, Z Ni, A Astafurov… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

This document describes version 0.10 of TorchAudio: building blocks for machine learning
applications in the audio and speech processing domain. The objective of TorchAudio is to …

Cited by: 216

Wav2vec-switch: Contrastive learning from original-noisy speech pairs for robust speech recognition

Y Wang, J Li, H Wang, Y Qian… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

The goal of self-supervised learning (SSL) for automatic speech recognition (ASR) is to
learn good speech representations from a large amount of unlabeled speech for the …

Cited by: 71

End-to-end Arabic speech recognition: A review

AA Abdelhamid, HA Alsayadi, I Hegazy… - Proceedings of the …, 2020 - researchgate.net

Automatic speech recognition (ASR) is a crucial field of science due to its massive
applications that can be developed to help humans to improve their daily life tasks. Despite …

Cited by: 30

The 2020 espnet update: new features, broadened applications, performance improvements, and future plans

S Watanabe, F Boyer, X Chang, P Guo… - 2021 IEEE Data …, 2021 - ieeexplore.ieee.org

This paper describes the recent development of ESPnet (https://github.com/espnet/espnet),
an end-to-end speech processing toolkit. This project was initiated in December 2017 to …

Cited by: 57

Arabic speech recognition using end‐to‐end deep learning

HA Alsayadi, AA Abdelhamid, I Hegazy… - IET Signal …, 2021 - Wiley Online Library

Arabic automatic speech recognition (ASR) methods with diacritics have the ability to be
integrated with other systems better than Arabic ASR methods without diacritics. In this work …

Cited by: 58

Efficient sequence transduction by jointly predicting tokens and durations

H Xu, F Jia, S Majumdar, H Huang… - International …, 2023 - proceedings.mlr.press

This paper introduces a novel Token-and-Duration Transducer (TDT) architecture for
sequence-to-sequence tasks. TDT extends conventional RNN-Transducer architectures by …

Cited by: 12

Wake word detection with streaming transformers

Y Wang, H Lv, D Povey, L Xie… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Modern wake word detection systems usually rely on neural networks for acoustic modeling.
Transformers have recently shown superior performance over LSTM and convolutional …

Cited by: 43


Espresso: A fast end-to-end neural speech recognition toolkit

Recent advances in end-to-end automatic speech recognition
