- Academic Search

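Parallel learning: Overview and perspective for computational learning across Syn2Real and Sim2Real

Q Miao, Y Lv, M Huang, X Wang… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org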

The virtual-to-real paradigm, i.e., training models on virtual data and then applying them to
solve real-world problems, has attracted more and more attention from various domains by …

Cited by 88

On the adoption of modern technologies to fight the COVID-19 pandemic: a technical synthesis of latest developments

A Majeed, X Zhang - COVID, 2023 - mdpi.com

In the ongoing COVID-19 pandemic, digital technologies have played a vital role to minimize
the spread of COVID-19, and to control its pitfalls for the general public. Without such …

Cited by 13

A semi-supervised complementary joint training approach for low-resource speech recognition

YQ Du, J Zhang, X Fang, MH Wu… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org

Both unpaired speech and text have shown to be beneficial for low-resource automatic
speech recognition (ASR), which, however were either separately used for pre-training, self …

Cited by 4

Generating data with text-to-speech and large-language models for conversational speech recognition

S Cornell, J Darefsky, Z Duan, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org

Currently, a common approach in many speech processing tasks is to leverage large scale
pre-trained models by fine-tuning them on in-domain data for a particular application. Yet …

Cited by 2

On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition

N Rossenbach, R Schlüter, S Sakti - arXiv preprint arXiv:2407.21476, 2024 - arxiv.org

The rapid development of neural text-to-speech (TTS) systems enabled its usage in other
areas of natural language processing such as automatic speech recognition (ASR) or …

Cited by 3

Text is all you need: Personalizing ASR models using controllable speech synthesis

K Yang, TY Hu, JHR Chang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Adapting generic speech recognition models to specific individuals is a challenging problem
due to the scarcity of personalized data. Recent works have proposed boosting the amount …

Cited by 13

Phoneme hallucinator: One-shot voice conversion via set expansion

S Shan, Y Li, A Banerjee, JB Oliva - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

Voice conversion (VC) aims at altering a person's voice to make it sound similar to the voice
of another person while preserving linguistic content. Existing methods suffer from a …

Cited by 5

Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition

S Ueno, A Lee, T Kawahara - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

While end-to-end automatic speech recognition (ASR) has shown impressive performance,
it requires a huge amount of speech and transcription data. The conversion of domain …

Cited by 1

Can we use Common Voice to train a Multi-Speaker TTS system?

S Ogun, V Colotte, E Vincent - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org

Training of multi-speaker text-to-speech (TTS) systems relies on curated datasets based on
high-quality recordings or audiobooks. Such datasets often lack speaker diversity and are …

Cited by 11

On the effect of purely synthetic training data for different automatic speech recognition architectures

B Hilmes, N Rossenbach - arXiv preprint arXiv:2407.17997, 2024 - arxiv.org

In this work we evaluate the utility of synthetic data for training automatic speech recognition
(ASR). We use the ASR training data to train a text-to-speech (TTS) system similar to …

Cited by 2

Synt++: Utilizing imperfect synthetic data to improve speech recognition

Parallel learning: Overview and perspective for computational learning across Syn2Real and Sim2Real