Natural language processing for dialects of a language: A survey

A Joshi, R Dabre, D Kanojia, Z Li, H Zhan… - ACM Computing …, 2024 - dl.acm.org
State-of-the-art natural language processing (NLP) models are trained on massive training
corpora, and report a superlative performance on evaluation datasets. This survey delves …

Camelira: An Arabic multi-dialect morphological disambiguator

O Obeid, G Inoue, N Habash - arXiv preprint arXiv:2211.16807, 2022 - arxiv.org
We present Camelira, a web-based Arabic multi-dialect morphological disambiguation tool
that covers four major variants of Arabic: Modern Standard Arabic, Egyptian, Gulf, and …

Morphotactic modeling in an open-source multi-dialectal Arabic morphological analyzer and generator

N Habash, R Marzouk, C Khairallah… - Proceedings of the 19th …, 2022 - aclanthology.org
Arabic is a morphologically rich and complex language, with numerous dialectal variants.
Previous efforts on Arabic morphology modeling focused on specific variants and specific …

The Najdi Arabic Corpus: a new corpus for an underrepresented Arabic dialect

R Alhedayani - Language Resources and Evaluation, 2024 - Springer
This paper presents a new corpus for a dialect of Arabic spoken in the central region of
Saudi Arabia: the Najdi Arabic Corpus. This is the first publicly available corpus for this …

Transformers on multilingual clause-level morphology

EC Acikgoz, T Chubakov, M Kural, GG Şahin… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper describes our winning systems in MRL: The 1st Shared Task on Multilingual
Clause-level Morphology (EMNLP 2022 Workshop) designed by KUIS AI NLP team. We …

ALMA: Fast Lemmatizer and POS Tagger for Arabic

M Jarrar, D Akra, T Hammouda - Procedia Computer Science, 2024 - Elsevier
We introduce Alma, an open-source and state-of-the-art lemmatizer, POS tagger, and root
tagger for Arabic, boasting both high speed and accuracy. Alma relies on a dictionary of …

Strategies for Arabic Readability Modeling

JP Liberato, B Alhafni, MA Khalil, N Habash - arXiv preprint arXiv …, 2024 - arxiv.org
Automatic readability assessment is relevant to building NLP applications for education,
content analysis, and accessibility. However, Arabic readability assessment is a challenging …

Computational Morphology and Lexicography Modeling of Modern Standard Arabic Nominals

C Khairallah, R Marzouk, S Khalifa, M Nassar… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern Standard Arabic (MSA) nominals present many morphological and lexical modeling
challenges that have not been consistently addressed previously. This paper attempts to …

Deep Active Learning for Morphophonological Processing

SM Mirbostani, Y Boreshban, S Khalifa… - Proceedings of the …, 2023 - aclanthology.org
Building a system for morphological processing is a challenging task in morphologically
complex languages like Arabic. Although there are some deep learning based models that …

Advancements in Arabic grammatical error detection and correction: An empirical investigation

B Alhafni, G Inoue, C Khairallah, N Habash - arXiv preprint arXiv …, 2023 - arxiv.org
Grammatical error correction (GEC) is a well-explored problem in English with many existing
models and datasets. However, research on GEC in morphologically rich languages has …