Decentralizing feature extraction with quantum convolutional neural network for automatic speech recognition

CHH Yang, J Qi, SYC Chen, PY Chen… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
We propose a novel decentralized feature extraction approach in federated learning to
address privacy-preservation issues for speech recognition. It is built upon a quantum …

Whispering LLaMA: A cross-modal generative error correction framework for speech recognition

S Radhakrishnan, CHH Yang, SA Khan… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce a new cross-modal fusion technique designed for generative error correction
in automatic speech recognition (ASR). Our methodology leverages both acoustic …

Low-rank adaptation of large language model rescoring for parameter-efficient speech recognition

Y Yu, CHH Yang, J Kolehmainen… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
We propose a neural language modeling system based on low-rank adaptation (LoRA) for
speech recognition output rescoring. Although pretrained language models (LMs) like BERT …

Large language models are efficient learners of noise-robust speech recognition

Y Hu, C Chen, CHH Yang, R Li, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in large language models (LLMs) have promoted generative error
correction (GER) for automatic speech recognition (ASR), which leverages the rich linguistic …

When BERT meets quantum temporal convolution learning for text classification in heterogeneous computing

CHH Yang, J Qi, SYC Chen, Y Tsao… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The rapid development of quantum computing has demonstrated many unique
characteristics of quantum advantages, such as richer feature representation and more …

GenTranslate: Large language models are generative multilingual speech and machine translators

Y Hu, C Chen, CHH Yang, R Li, D Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in large language models (LLMs) have stepped forward the development
of multilingual speech and machine translation by its reduced representation errors and …

Parameter-efficient learning for text-to-speech accent adaptation

LJ Yang, CHH Yang, JT Chien - arXiv preprint arXiv:2305.11320, 2023 - arxiv.org
This paper presents a parameter-efficient learning (PEL) to develop a low-resource accent
adaptation for text-to-speech (TTS). A resource-efficient adaptation from a frozen pre-trained …

A lottery ticket hypothesis framework for low-complexity device-robust neural acoustic scene classification

H Yen, CHH Yang, H Hu, SM Siniscalchi… - arXiv preprint arXiv …, 2021 - arxiv.org
We propose a novel neural model compression strategy combining data augmentation,
knowledge transfer, pruning, and quantization for device-robust acoustic scene classification …

Procter: Pronunciation-aware contextual adapter for personalized speech recognition in neural transducers

R Pandey, R Ren, Q Luo, J Liu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants
often have difficulties recognizing infrequent words personalized to the user, such as names …

Pinyin regularization in error correction for Chinese speech recognition with large language models

Z Tang, D Wang, S Huang, S Shang - arXiv preprint arXiv:2407.01909, 2024 - arxiv.org
Recent studies have demonstrated the efficacy of large language models (LLMs) in error
correction for automatic speech recognition (ASR). However, much of the research focuses …