SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings
A Antonova, E Bakhturina, B Ginsburg - arXiv preprint arXiv:2306.02317, 2023 - arxiv.org
Contextual spelling correction models are an alternative to shallow fusion to improve
automatic speech recognition (ASR) quality given user vocabulary. To deal with large user …
NeMo Forced Aligner and its application to word alignment for subtitle generation
We present NeMo Forced Aligner (NFA): an efficient and accurate forced aligner
which is part of the NeMo conversational AI open-source toolkit. NFA can produce token …
Revisiting automatic speech recognition for Tamil and Hindi connected number recognition
R Mishra, SRG Boopathy, M Ravikiran… - Proceedings of the …, 2023 - aclanthology.org
Automatic Speech Recognition and its applications are rising in popularity across
applications with reasonable inference results. Recent state-of-the-art approaches, often …
Building and curating conversational corpora for diversity-aware language science and technology
We present an analysis pipeline and best practice guidelines for building and curating
corpora of everyday conversation in diverse languages. Surveying language documentation …
Everyday conversations: a comparative study of expert transcriptions and ASR outputs at a lexical level
The study examines the outcomes of automatic speech recognition (ASR) applied to field
recordings of daily Russian speech. Everyday conversations, captured in real-life …
Automatic Time Alignment Generation For End-to-End ASR Using Acoustic Probability Modelling
End-to-end trainable (E2E) automatic speech recognition (ASR) models can achieve low
error rates, but unlike hidden Markov model (HMM)-based systems they cannot naturally …
Methodology for Obtaining High-Quality Speech Corpora
A Wieczorkowska - Applied Sciences, 2025 - mdpi.com
Speech-based communication between users and machines is a very lively branch of
research that covers speech recognition, synthesis, and, generally, natural language …
An analysis of large speech models-based representations for speech emotion recognition
AB Stânea, V Strilețchi, C Strilețchi… - … Conference on Speech …, 2023 - ieeexplore.ieee.org
Large speech models-derived features have recently shown increased performance over
signal-based features across multiple downstream tasks, even when the networks are not …
Validation of Speech Data for Training Automatic Speech Recognition Systems
Recent automatic speech recognition systems are largely based on deep neural networks
that need large amounts of labelled speech data to train. This can be a problem, especially …