- Academic Search

Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 440 · 7 versions

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org

We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Cited by 102 · 7 versions

Scaling speech technology to 1,000+ languages

V Pratap, A Tjandra, B Shi, P Tomasello, A Babu… - Journal of Machine …, 2024 - jmlr.org

Expanding the language coverage of speech technology has the potential to improve
access to information for many more people. However, current speech technology is …

Cited by 293 · 3 versions

Robust speech recognition via large-scale weak supervision

A Radford, JW Kim, T Xu, G Brockman… - International …, 2023 - proceedings.mlr.press

We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …

Cited by 3834 · 11 versions

Unsupervised cross-lingual representation learning for speech recognition

A Conneau, A Baevski, R Collobert… - arXiv preprint arXiv …, 2020 - arxiv.org

This paper presents XLSR which learns cross-lingual speech representations by pretraining
a single model from the raw waveform of speech in multiple languages. We build on …

Cited by 890 · 10 versions

Automatic speech recognition: a survey

M Malik, MK Malik, K Mehmood… - Multimedia Tools and …, 2021 - Springer

Recently great strides have been made in the field of automatic speech recognition (ASR) by
using various deep learning techniques. In this study, we present a thorough comparison …

Cited by 383 · 8 versions

Improving continuous sign language recognition with cross-lingual signs

F Wei, Y Chen - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com

This work dedicates to continuous sign language recognition (CSLR), which is a weakly
supervised task dealing with the recognition of continuous signs from videos, without any …

Cited by 30 · 5 versions

Massively multilingual ASR: 50 languages, 1 model, 1 billion parameters

V Pratap, A Sriram, P Tomasello, A Hannun… - arXiv preprint arXiv …, 2020 - arxiv.org

We study training a single acoustic model for multiple languages with the aim of improving
automatic speech recognition (ASR) performance on low-resource languages, and over-all …

Cited by 161 · 8 versions

Large-scale multilingual speech recognition with a streaming end-to-end model

A Kannan, A Datta, TN Sainath, E Weinstein… - arXiv preprint arXiv …, 2019 - arxiv.org

Multilingual end-to-end (E2E) models have shown great promise in expansion of automatic
speech recognition (ASR) coverage of the world's languages. They have shown …

Cited by 200 · 9 versions

Lingvo: a modular and scalable framework for sequence-to-sequence modeling

J Shen, P Nguyen, Y Wu, Z Chen, MX Chen… - arXiv preprint arXiv …, 2019 - arxiv.org

Lingvo is a Tensorflow framework offering a complete solution for collaborative deep
learning research, with a particular focus towards sequence-to-sequence models. Lingvo …

Cited by 214 · 6 versions

Multilingual speech recognition with a single end-to-end model

Recent advances in end-to-end automatic speech recognition
