Arabic natural language processing: An overview

I Guellil, H Saâdane, F Azouaou, B Gueni… - Journal of King Saud …, 2021 - Elsevier
Arabic is recognised as the 4th most used language of the Internet. Arabic has three main
varieties: (1) classical Arabic (CA), (2) Modern Standard Arabic (MSA), (3) Arabic Dialect (AD) …

Natural language processing for dialects of a language: A survey

A Joshi, R Dabre, D Kanojia, Z Li, H Zhan… - ACM Computing …, 2024 - dl.acm.org
State-of-the-art natural language processing (NLP) models are trained on massive training
corpora, and report a superlative performance on evaluation datasets. This survey delves …

AraBERT: Transformer-based model for Arabic language understanding

W Antoun, F Baly, H Hajj - arXiv preprint arXiv:2003.00104, 2020 - arxiv.org
The Arabic language is a morphologically rich language with relatively few resources and a
less explored syntax compared to English. Given these limitations, Arabic Natural Language …

Systematic literature review of dialectal Arabic: identification and detection

A Elnagar, SM Yagi, AB Nassif, I Shahin… - IEEE …, 2021 - ieeexplore.ieee.org
It is becoming increasingly difficult to know who is working on what and how in
computational studies of Dialectal Arabic. This study comes to chart the field by conducting a …

Improving Arabic text categorization using transformer training diversification

SA Chowdhury, A Abdelali, K Darwish… - Proceedings of the …, 2020 - aclanthology.org
Automatic categorization of short texts, such as news headlines and social media posts, has
many applications ranging from content analysis to recommendation systems. In this paper …

The impact of preprocessing on Arabic-English statistical and neural machine translation

M Oudah, A Almahairi, N Habash - arXiv preprint arXiv:1906.11751, 2019 - arxiv.org
Neural networks have become the state-of-the-art approach for machine translation (MT) in
many languages. While linguistically-motivated tokenization techniques were shown to have …

AdaSL: An unsupervised domain adaptation framework for Arabic multi-dialectal sequence labeling

A El Mekki, A El Mahdaouy, I Berrada… - Information Processing & …, 2022 - Elsevier
Dialectal Arabic (DA) refers to varieties of everyday spoken languages in the Arab world.
These dialects differ according to the country and region of the speaker, and their textual …

Text mining techniques for sentiment analysis of Arabic dialects: Literature review

AA Al Shamsi, S Abdallah - Adv. Sci. Technol. Eng. Syst. J, 2021 - researchgate.net
Social media attracts a lot of users around the world. Many reasons drive people to use
social media sites such as expressing opinions and ideas, displaying their diaries and …

OMCD: Offensive Moroccan comments dataset

K Essefar, H Ait Baha, A El Mahdaouy… - Language Resources …, 2023 - Springer
Offensive content, such as verbal attacks, demeaning comments, or hate speech, has
become widespread on social media. Automatic detection of this content is considered an …

Utilizing character and word embeddings for text normalization with sequence-to-sequence models

D Watson, N Zalmout, N Habash - arXiv preprint arXiv:1809.01534, 2018 - arxiv.org
Text normalization is an important enabling technology for several NLP tasks. Recently,
neural-network-based approaches have outperformed well-established models in this task …