Large language models in medical and healthcare fields: applications, advances, and challenges

D Wang, S Zhang - Artificial Intelligence Review, 2024 - Springer
Large language models (LLMs) are increasingly recognized for their advanced language
capabilities, offering significant assistance in diverse areas like medical communication …

HyPoradise: An open baseline for generative speech recognition with large language models

C Chen, Y Hu, CHH Yang… - Advances in …, 2024 - proceedings.neurips.cc
Advancements in deep neural networks have allowed automatic speech recognition (ASR)
systems to attain human parity on several publicly available clean speech datasets …

Enhancing Conversation Smoothness in Language Learning Chatbots: An Evaluation of GPT4 for ASR Error Correction

L Mai, J Carson-Berndsen - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
The integration of natural language processing (NLP) technologies into educational
applications has shown promising results, particularly in the language learning domain …

WavePurifier: Purifying Audio Adversarial Examples via Hierarchical Diffusion Models

H Guo, G Wang, B Chen, Y Wang, X Zhang… - Proceedings of the 30th …, 2024 - dl.acm.org
In this paper, we propose WavePurifier, an audio purification framework to defend against
audio adversarial attacks. Audio adversarial attacks craft adversarial examples or …

I Learned Error, I Can Fix It!: A Detector-Corrector Structure for ASR Error Calibration

HY Yeen, MJ Kim, MW Koo - Proc. INTERSPEECH, 2023 - isca-archive.org
Speech recognition technology has improved recently. However, in the context of spoken
language understanding (SLU), containing automatic speech recognition (ASR) errors …

Evaluating Open-Source ASR Systems: Performance Across Diverse Audio Conditions and Error Correction Methods

S Imai, T Chowdhury, A Stent - Proceedings of the 31st …, 2025 - aclanthology.org
Despite significant advances in automatic speech recognition (ASR) accuracy, challenges
remain. Naturally occurring conversation often involves multiple overlapping speakers, of …

Residual adapters for targeted updates in RNN-transducer based speech recognition system

S Han, D Baby, V Mendelev - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org
This paper investigates an approach for adapting RNN-Transducer (RNN-T) based
automatic speech recognition (ASR) model to improve the recognition of unseen words …

DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition

YC Wang, HW Wang, BC Yan, CH Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
End-to-end automatic speech recognition (E2E ASR) systems often suffer from
mistranscription of domain-specific phrases, such as named entities, sometimes leading to …

MathSpeech: Leveraging Small LMs for Accurate Conversion in Mathematical Speech-to-Formula

S Hyeon, K Jung, J Won, NJ Kim, HG Ryu… - arXiv preprint arXiv …, 2024 - arxiv.org
In various academic and professional settings, such as mathematics lectures or research
presentations, it is often necessary to convey mathematical expressions orally. However …

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition

Y Shu, B Hu, Y He, H Shi, L Wang, J Dang - arXiv preprint arXiv …, 2024 - arxiv.org
Accurately finding the wrong words in the automatic speech recognition (ASR) hypothesis
and recovering them well-founded is the goal of speech error correction. In this paper, we …