Semisupervised Speech Data Extraction from Basque Parliament Sessions and Validation on Fully Bilingual Basque–Spanish ASR
In this paper, a semisupervised speech data extraction method is presented and applied to
create a new dataset designed for the development of fully bilingual Automatic Speech …
Unsupervised domain adaptation for speech recognition with unsupervised error correction
L Mai, J Carson-Berndsen - arXiv preprint arXiv:2209.12043, 2022 - arxiv.org
The transcription quality of automatic speech recognition (ASR) systems degrades
significantly when transcribing audios coming from unseen domains. We propose an …
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which
leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) …
Semisupervised training of a fully bilingual ASR system for Basque and Spanish
Automatic speech recognition (ASR) of speech signals with code-switching (an abrupt
language change common in bilingual communities) typically requires spoken language …
Overcoming domain mismatch in low resource sequence-to-sequence ASR models using hybrid generated pseudotranscripts
Sequence-to-sequence (seq2seq) models are competitive with hybrid models for automatic
speech recognition (ASR) tasks when large amounts of training data are available …
Combining Unsupervised and Text Augmented Semi-Supervised Learning For Low Resourced Autoregressive Speech Recognition
CF Li, F Keith, W Hartmann… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Recent advances in unsupervised representation learning have demonstrated the impact of
pretraining on large amounts of read speech. We adapt these techniques for domain …
Training Autoregressive Speech Recognition Models with Limited in-domain Supervision
Advances in self-supervised learning have significantly reduced the amount of transcribed
audio required for training. However, the majority of work in this area is focused on read …
Domain Adaptation‐Based Self‐Supervised ASR Models for Low‐Resource Target Domain
L Ashok Kumar, D Karthika Renuka… - … and Translation for …, 2024 - Wiley Online Library
Domain adaptation is the concept of improving the performance of a model on a target
domain, by leveraging the knowledge gained from the source domain. Speech recognition …
Enhancing the Performance of NMT Models Using the Data-Based Domain Adaptation Technique for Patent Translation
M Ahmed - 2023 - ir.lib.uwo.ca
During today's age of unparalleled connectivity, language and data have become powerful
tools capable of enabling effective communication and cross-cultural collaborations. Neural …
Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering
In real-world speech data processing, the scarcity of annotated data and the abundance of
unlabelled speech data present a significant challenge. To address this, we propose an …