Bengali Common Voice speech dataset for automatic speech recognition

S Alam, A Sushmit, Z Abdullah, S Nakkhatra… - arXiv preprint arXiv …, 2022 - arxiv.org
Bengali is one of the most spoken languages in the world with over 300 million speakers
globally. Despite its popularity, research into the development of Bengali speech recognition …

Context-aware transliteration of romanized South Asian languages

C Kirov, C Johny, A Katanova, A Gutkin… - Computational …, 2024 - direct.mit.edu
While most transliteration research is focused on single tokens such as named entities—for
example, transliteration of અમદાવાદ from the Gujarati script to the Latin script “Ahmedabad” …

OOD-Speech: A large Bengali speech recognition dataset for out-of-distribution benchmarking

FR Rakib, SS Dip, S Alam, N Tasnim… - arXiv preprint arXiv …, 2023 - arxiv.org
We present OOD-Speech, the first out-of-distribution (OOD) benchmarking dataset for
Bengali automatic speech recognition (ASR). Being one of the most spoken languages …

Jambu: A historical linguistic database for South Asian languages

A Arora, A Farris, S Basu, S Kolichala - arXiv preprint arXiv:2306.02514, 2023 - arxiv.org
We introduce Jambu, a cognate database of South Asian languages which unifies dozens of
previous sources in a structured and accessible format. The database includes 287k …

Supervised grapheme-to-phoneme conversion of orthographic schwas in Hindi and Punjabi

A Arora, L Gessler, N Schneider - arXiv preprint arXiv:2004.10353, 2020 - arxiv.org
Hindi grapheme-to-phoneme (G2P) conversion is mostly trivial, with one exception: whether
a schwa represented in the orthography is pronounced or unpronounced (deleted). Previous …

Cross-Lingual Consistency of Phonological Features: An Empirical Study.

C Johny, A Gutkin, M Jansche - INTERSPEECH, 2019 - researchgate.net
The concept of a phoneme arose historically as a theoretical abstraction that applies
language-internally. Using phonemes and phonological features in cross-linguistic settings …

A character gram modeling approach towards Bengali Speech to Text with Regional Dialects

MR Hassan - 2023 - researchgate.net
The Bengali language, spoken in various regions of South Asia and also among the Bengali
diaspora, exhibits rich diversity with regional dialects or variations that reflect the cultural …