Dialect-to-standard normalization: A large-scale multilingual evaluation

O Kuparinen, AM Haddad… - Conference on Empirical …, 2023 - researchportal.helsinki.fi
Text normalization methods have been commonly applied to historical language or user-
generated content, but less often to dialectal transcriptions. In this paper, we introduce …

Natural language processing for similar languages, varieties, and dialects: A survey

M Zampieri, P Nakov, Y Scherrer - Natural Language Engineering, 2020 - cambridge.org
There has been a lot of recent interest in the natural language processing (NLP) community
in the computational processing of language varieties and dialects, with the aim to improve …

DIALECTBENCH: A NLP Benchmark for Dialects, Varieties, and Closely-Related Languages

F Faisal, O Ahia, A Srivastava, K Ahuja… - arXiv preprint arXiv …, 2024 - arxiv.org
Language technologies should be judged on their usefulness in real-world use cases. An
often overlooked aspect in natural language processing (NLP) research and evaluation is …

The Use of ASR-Equipped Software in the Teaching of Suprasegmental Features of Pronunciation: A Critical Review

T Kochem, J Beck, E Goodale - CALICO Journal, 2022 - search.ebscohost.com
Technology has paved the way for new modalities in language learning, teaching, and
assessment. However, there is still a great deal of work to be done to develop such tools for …

Representing variation in a spoken corpus of an endangered dialect: the case of Torlak

T Vuković - Language Resources and Evaluation, 2021 - Springer
The paper presents a spoken corpus of the endangered Torlak dialect from the Timok area
of Southeast Serbia. This dialect expresses a great deal of variation in the use of non …

SwissDial: Parallel multidialectal corpus of spoken Swiss German

P Dogan-Schönberger, J Mäder, T Hofmann - arXiv preprint arXiv …, 2021 - arxiv.org
Swiss German is a dialect continuum whose natively acquired dialects significantly differ
from the formal variety of the language. These dialects are mostly used for verbal …

Character alignment methods for dialect-to-standard normalization

Y Scherrer - Proceedings of the 20th SIGMORPHON workshop on …, 2023 - aclanthology.org
This paper evaluates various character alignment methods on the task of sentence-level
standardization of dialect transcriptions. We compare alignment methods from different …

Exploring the possibilities of Thomson's fourth paradigm transformation—The case for a multimodal approach to digital oral history?

HK Smyth, J Nyhan, A Flinn - Digital Scholarship in the …, 2023 - academic.oup.com
This article seeks to reorientate 'digital oral history' towards a new research paradigm,
Multimodal Digital Oral History (MDOH), and in so doing it seeks to build upon Alistair …

Unsupervised deep language and dialect identification for short texts

K Goswami, R Sarkar, BR Chakravarthi… - Proceedings of the …, 2020 - aclanthology.org
Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of
closely related languages or dialects, is one of the primary steps in many natural language …

ASR for Non-standardised Languages with Dialectal Variation: the case of Swiss German

I Nigmatulina, T Kew, T Samardzic - … of the 7th Workshop on NLP …, 2020 - aclanthology.org
Strong regional variation, together with the lack of standard orthography, makes Swiss
German automatic speech recognition (ASR) particularly difficult in a multi-dialectal setting …