Acoustic modeling based on deep learning for low-resource speech recognition: An overview

C Yu, M Kang, Y Chen, J Wu, X Zhao - IEEE Access, 2020 - ieeexplore.ieee.org
The polarization of world languages is becoming more and more obvious. Many languages,
mainly endangered languages, are of low-resource attribute due to lack of information. Both …

ASR error correction and domain adaptation using machine translation

A Mani, S Palaskar, NV Meripo… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly
viable service for companies of any size building speech-based products. While these ASR …

Ensembling End-to-End Deep Models for Computational Paralinguistics Tasks: ComParE 2020 Mask and Breathing Sub-Challenges

M Markitantov, D Dresvyanskiy, D Mamontov… - …, 2020 - isca-archive.org
This paper describes deep learning approaches for the Mask and Breathing Sub-
Challenges (SCs), which are addressed by the INTERSPEECH 2020 Computational …

Towards understanding ASR error correction for medical conversations

A Mani, S Palaskar, S Konam - … of the first workshop on natural …, 2020 - aclanthology.org
Domain Adaptation for Automatic Speech Recognition (ASR) error correction via
machine translation is a useful technique for improving out-of-domain outputs of pre-trained …

Exploring phoneme-level speech representations for end-to-end speech translation

E Salesky, M Sperber, AW Black - arXiv preprint arXiv:1906.01199, 2019 - arxiv.org
Previous work on end-to-end translation from speech has primarily used frame-level
features as speech representations, which creates longer, sparser sequences than text. We …

Conversations in the wild: Data collection, automatic generation and evaluation

N Zaheer, AA Raza, M Shabbir - Computer Speech & Language, 2025 - Elsevier
The aim of conversational speech processing is to analyze human conversations in natural
settings. It finds numerous applications in personality traits identification, speech therapy …

Multilingual speech recognition with corpus relatedness sampling

X Li, S Dalmia, AW Black, F Metze - arXiv preprint arXiv:1908.01060, 2019 - arxiv.org
Multilingual acoustic models have been successfully applied to low-resource speech
recognition. Most existing works have combined many small corpora together and …

Transfer learning in speaker's age and gender recognition

M Markitantov - Speech and Computer: 22nd International Conference …, 2020 - Springer
In this paper, we study an application of transfer learning approach to speaker's age and
gender recognition task. Recently, speech analysis systems, which take images of log Mel …

Computational tools for endangered language documentation

A Anastasopoulos - 2019 - search.proquest.com
Computational Tools for Endangered Language Documentation. A dissertation submitted to the
Graduate School of the University of N …

Reducing spelling inconsistencies in code-switching ASR using contextualized CTC loss

B Naowarat, T Kongthaworn… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Code-Switching (CS) remains a challenge for Automatic Speech Recognition (ASR),
especially character-based models. With the combined choice of characters from multiple …