Prompting large language models for zero-shot domain adaptation in speech recognition
The integration of Language Models (LMs) has proven to be an effective way to address
domain shifts in speech recognition. However, these approaches usually require a …
Adaptable end-to-end ASR models using replaceable internal LMs and residual softmax
End-to-end (E2E) automatic speech recognition (ASR) implicitly learns the token sequence
distribution of paired audio-transcript training data. However, it still suffers from domain shifts …
Label-synchronous neural transducer for end-to-end ASR
Neural transducers provide a natural approach to streaming ASR. However, they augment
output sequences with blank tokens, which leads to challenges for domain adaptation using …
Decoder-only architecture for speech recognition with CTC prompts and text data augmentation
Collecting audio-text pairs is expensive; however, it is much easier to access text-only data.
Unless using shallow fusion, end-to-end automatic speech recognition (ASR) models …
Decoupled structure for improved adaptability of end-to-end models
Although end-to-end (E2E) trainable automatic speech recognition (ASR) has shown great
success by jointly learning acoustic and linguistic information, it still suffers from the effect of …
Zero-Shot Domain-Sensitive Speech Recognition with Prompt-Conditioning Fine-Tuning
In this work, we propose a method to create domain-sensitive speech recognition models
that utilize textual domain information by conditioning their generation on a given text prompt …
Label-synchronous neural transducer for adaptable online E2E speech recognition
Although end-to-end (E2E) automatic speech recognition (ASR) has shown state-of-the-art
recognition accuracy, it tends to be implicitly biased towards the training data distribution …
Hybrid Attention-Based Encoder-Decoder Model for Efficient Language Model Adaptation
The attention-based encoder-decoder (AED) speech recognition model has been widely
successful in recent years. However, the joint optimization of acoustic model and language …
FastInject: Injecting Unpaired Text Data into CTC-Based ASR Training
Recently, connectionist temporal classification (CTC)-based end-to-end (E2E) automatic
speech recognition (ASR) models have achieved impressive results, especially with the …
Joint On-Demand Pruning and Online Distillation in Automatic Speech Recognition Language Model Optimization
Automatic speech recognition (ASR) systems have emerged as indispensable tools across a
wide spectrum of applications, ranging from transcription services to voice-activated …