- Academic Search

Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 440

XLS-R: Self-supervised cross-lingual speech representation learning at scale

A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …

Cited by 720

Unsupervised cross-lingual representation learning for speech recognition

A Conneau, A Baevski, R Collobert… - arXiv preprint arXiv …, 2020 - arxiv.org

This paper presents XLSR which learns cross-lingual speech representations by pretraining
a single model from the raw waveform of speech in multiple languages. We build on …

Cited by 892

Applying wav2vec2.0 to speech recognition in various low-resource languages

C Yi, J Wang, N Cheng, S Zhou, B Xu - arXiv preprint arXiv:2012.12121, 2020 - arxiv.org

There are several domains that own corresponding widely used feature extractors, such as
ResNet, BERT, and GPT-x. These models are usually pre-trained on large amounts of …

Cited by 100

Automatic speech recognition for Uyghur, Kazakh, and Kyrgyz: An overview

W Du, Y Maimaitiyiming, M Nijat, L Li, A Hamdulla… - Applied Sciences, 2022 - mdpi.com

With the emergence of deep learning, the performance of automatic speech recognition
(ASR) systems has remarkably improved. Especially for resource-rich languages such as …

Cited by 14

Multilingual end-to-end speech translation

H Inaguma, K Duh, T Kawahara… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

In this paper, we propose a simple yet effective framework for multilingual end-to-end
speech translation (ST), in which speech utterances in source languages are directly …

Cited by 99

Leveraging modality-specific representations for audio-visual speech recognition via reinforcement learning

C Chen, Y Hu, Q Zhang, H Zou, B Zhu… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

Audio-visual speech recognition (AVSR) has gained remarkable success for ameliorating
the noise-robustness of speech recognition. Mainstream methods focus on fusing audio and …

Cited by 28

Massively multilingual adversarial speech recognition

O Adams, M Wiesner, S Watanabe… - arXiv preprint arXiv …, 2019 - arxiv.org

We report on adaptation of multilingual end-to-end speech recognition models trained on as
many as 100 languages. Our findings shed light on the relative importance of similarity …

Cited by 89

Hierarchical transfer learning for multilingual, multi-speaker, and style transfer DNN-based TTS on low-resource languages

K Azizah, M Adriani, W Jatmiko - IEEE Access, 2020 - ieeexplore.ieee.org

This work applies a hierarchical transfer learning to implement deep neural network (DNN)-
based multilingual text-to-speech (TTS) for low-resource languages. DNN-based system …

Cited by 37

XTREME-S: Evaluating cross-lingual speech representations

A Conneau, A Bapna, Y Zhang, M Ma… - arXiv preprint arXiv …, 2022 - arxiv.org

We introduce XTREME-S, a new benchmark to evaluate universal cross-lingual speech
representations in many languages. XTREME-S covers four task families: speech …

Cited by 22

Transfer learning of language-independent end-to-end ASR with language model fusion

Recent advances in end-to-end automatic speech recognition
