Deep transfer learning for automatic speech recognition: Towards better generalization

H Kheddar, Y Himeur, S Al-Maadeed, A Amira… - Knowledge-Based …, 2023 - Elsevier
Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …

Simple and effective zero-shot cross-lingual phoneme recognition

Q Xu, A Baevski, M Auli - arXiv preprint arXiv:2109.11680, 2021 - arxiv.org
Recent progress in self-training, self-supervised pretraining and unsupervised learning
enabled well performing speech recognition systems without any labeled data. However, in …

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition

S Feng, M Tu, R Xia, C Huang, Y Wang - arXiv preprint arXiv:2305.11569, 2023 - arxiv.org
We improve low-resource ASR by integrating the ideas of multilingual training and self-
supervised learning. Concretely, we leverage an International Phonetic Alphabet (IPA) …

Integrated end-to-end multilingual method for low-resource agglutinative languages using Cyrillic scripts

A Bekarystankyzy, A Razaque… - Journal of Industrial …, 2025 - Elsevier
Millions of individuals across the world use automatic speech recognition (ASR) systems
every day to dictate messages, operate gadgets, begin searches, and enable data entry in …

Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings

MS Ribeiro, G Comini, J Lorenzo-Trueba - arXiv preprint arXiv:2307.16643, 2023 - arxiv.org
The Grapheme-to-Phoneme (G2P) task aims to convert orthographic input into a discrete
phonetic representation. G2P conversion is beneficial to various speech processing …

Speech Recognition Transformers: Topological-lingualism Perspective

S Singh, M Singh, V Kadyan - arXiv preprint arXiv:2408.14991, 2024 - arxiv.org
Transformers have evolved with great success in various artificial intelligence tasks. Thanks
to our recent prevalence of self-attention mechanisms, which capture long-term …

How do Phonological Properties Affect Bilingual Automatic Speech Recognition?

S Jain, A Yadavalli, GS Mirishkar… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Multilingual Automatic Speech Recognition (ASR) for Indian languages is an obvious
technique for leveraging their similarities. We present a detailed analysis of how …

UniGlyph: A Seven-Segment Script for Universal Language Representation

GV Sherin, AA Euphrine, A Moreen, LA Jose - arXiv preprint arXiv …, 2024 - arxiv.org
UniGlyph is a constructed language (conlang) designed to create a universal transliteration
system using a script derived from seven-segment characters. The goal of UniGlyph is to …

Novel Rifle Number Recognition Based on Improved YOLO in Military Environment.

H Kwon, S Lee - Computers, Materials & Continua, 2024 - cdn.techscience.cn
Deep neural networks perform well in image recognition, object recognition, pattern
analysis, and speech recognition. In military applications, deep neural networks can detect …

Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling

K Manohar, R Rajan - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer
This article presents the research work on improving speech recognition systems for the
morphologically complex Malayalam language using subword tokens for language …