Arabic natural language processing: An overview

I Guellil, H Saâdane, F Azouaou, B Gueni… - Journal of King Saud …, 2021 - Elsevier
Arabic is recognised as the 4th most used language of the Internet. Arabic has three main
varieties: (1) Classical Arabic (CA), (2) Modern Standard Arabic (MSA), (3) Arabic Dialect (AD) …

The interplay of variant, size, and task type in Arabic pre-trained language models

G Inoue, B Alhafni, N Baimukan, H Bouamor… - ar…

… review

A Ahmed, N Ali, M Alzubaidi, W Zaghouani… - Computer Methods and …, 2022 - Elsevier
Background: Corpora play a vital role when training machine learning (ML) models and
building systems that use natural language processing (NLP). It can be challenging for …

A panoramic survey of natural language processing in the Arab world

K Darwish, N Habash, M Abbas, H Al-Khalifa… - Communications of the …, 2021 - dl.acm.org
The term natural language refers to any system of symbolic communication (spoken,
signed, or written) that has evolved naturally in humans without intentional human planning …

Nâbra: Syrian Arabic dialects with morphological annotations

A Nayouf, T Hammouda, M Jarrar, F Zaraket… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents Nâbra, a corpus of Syrian Arabic dialects with morphological
annotations. A team of Syrian natives collected more than 6K sentences containing about …

Curras + Baladi: Towards a Levantine corpus

KE Haff, M Jarrar, T Hammouda, F Zaraket - arXiv preprint arXiv …, 2022 - arxiv.org
The processing of the Arabic language is a complex field of research. This is due to many
factors, including the complex and rich morphology of Arabic, its high degree of ambiguity …

Part-of-speech tagging for Arabic tweets using CRF and Bi-LSTM

W AlKhwiter, N Al-Twairesh - Computer Speech & Language, 2021 - Elsevier
Over the past few years, Twitter has experienced massive growth and the volume of its
online content has increased rapidly. This content has been a rich source for several studies …

Qabas: An Open-Source Arabic Lexicographic Database

M Jarrar, T Hammouda - arXiv preprint arXiv:2406.06598, 2024 - arxiv.org
We present Qabas, a novel open-source Arabic lexicon designed for NLP applications. The
novelty of Qabas lies in its synthesis of 110 lexicons. Specifically, Qabas lexical entries …

Lisan: Yemeni, Iraqi, Libyan, and Sudanese Arabic dialect corpora with morphological annotations

M Jarrar, FA Zaraket, T Hammouda… - 2023 20th ACS/IEEE …, 2023 - ieeexplore.ieee.org
This article presents morphologically-annotated Yemeni, Sudanese, Iraqi, and Libyan Arabic
dialects (L̂isān) corpora. L̂isān features around 1.2 million tokens. We collected the …

Unified guidelines and resources for Arabic dialect orthography

N Habash, F Eryani, S Khalifa, O Rambow… - Proceedings of the …, 2018 - aclanthology.org
We present a unified set of guidelines and resources for conventional orthography of
dialectal Arabic. While Standard Arabic has well defined orthographic standards, none of the …