Homograph disambiguation through selective diacritic restoration

S Alqahtani, H Aldarmaki, M Diab - arXiv preprint arXiv:1912.04479, 2019 - arxiv.org
Lexical ambiguity, a challenging phenomenon in all natural languages, is particularly
prevalent for languages with diacritics that tend to be omitted in writing, such as Arabic …

Offensive Language Detection in Code-Mixed Bambara-French Corpus: Evaluating machine learning and deep learning classifiers

AK Diallo, K Abainia - 2023 International Conference on …, 2023 - ieeexplore.ieee.org
In this paper, we deal with offensive and abusive language detection on Bambara language,
which is an under-resourced language mainly spoken in Mali and some other African …

Investigating input and output units in diacritic restoration

S Alqahtani, M Diab - 2019 18th IEEE International Conference …, 2019 - ieeexplore.ieee.org
Diacritic restoration is the task of assigning diacritics (accents) for each character in a given
segment. The typical input levels that have been previously used in diacritic restoration …

[BOOK][B] Full and partial diacritic restoration: Development and impact on downstream applications

S Alqahtani - 2020 - search.proquest.com
Languages that include diacritics in speech but omit diacritics in writing to a certain degree
result in written texts that are even more ambiguous than typically expected. Not including …

Cultural Survival Heritage of Bambara Language by Using NLP

O Daou, SS Mohanty - Applying AI-Based Tools and Technologies …, 2024 - Springer
Bambara, also known as Bamanankan, is a West African language primarily spoken in Mali
by the Bambara ethnic group. It is the most widely spoken language in Mali, with …

Syllable Frequencies in Manding: Examples from Periodicals in Bamana and Maninka

A Rovenchak, V Vydrin - Glottometrics, 2020 - shs.hal.science
We study the rank-frequency distribution of syllables in texts from written press in Bamana
and Guinean Maninka, two closely related languages from the Manding group of the Mande …

Texts for the corpus of Nko: collection, conversion, and open issues

A Rovenchak - Mandenkan. Bulletin semestriel d'études …, 2018 - journals.openedition.org
This paper discusses the compilation of a Maninka corpus, where the majority of texts are
written in the Nko alphabet. Prospects for further development of the Nko corpus are briefly …