[HTML][HTML] Arabic natural language processing: An overview
Arabic is recognised as the 4th most used language of the Internet. Arabic has three main
varieties:(1) classical Arabic (CA),(2) Modern Standard Arabic (MSA),(3) Arabic Dialect (AD) …
varieties:(1) classical Arabic (CA),(2) Modern Standard Arabic (MSA),(3) Arabic Dialect (AD) …
Natural language processing for dialects of a language: A survey
State-of-the-art natural language processing (NLP) models are trained on massive training
corpora, and report a superlative performance on evaluation datasets. This survey delves …
corpora, and report a superlative performance on evaluation datasets. This survey delves …
Arabert: Transformer-based model for arabic language understanding
The Arabic language is a morphologically rich language with relatively few resources and a
less explored syntax compared to English. Given these limitations, Arabic Natural Language …
less explored syntax compared to English. Given these limitations, Arabic Natural Language …
Systematic literature review of dialectal Arabic: identification and detection
It is becoming increasingly difficult to know who is working on what and how in
computational studies of Dialectal Arabic. This study comes to chart the field by conducting a …
computational studies of Dialectal Arabic. This study comes to chart the field by conducting a …
Improving Arabic text categorization using transformer training diversification
Automatic categorization of short texts, such as news headlines and social media posts, has
many applications ranging from content analysis to recommendation systems. In this paper …
many applications ranging from content analysis to recommendation systems. In this paper …
The impact of preprocessing on Arabic-English statistical and neural machine translation
Neural networks have become the state-of-the-art approach for machine translation (MT) in
many languages. While linguistically-motivated tokenization techniques were shown to have …
many languages. While linguistically-motivated tokenization techniques were shown to have …
AdaSL: An unsupervised domain adaptation framework for Arabic multi-dialectal sequence labeling
Dialectal Arabic (DA) refers to varieties of everyday spoken languages in the Arab world.
These dialects differ according to the country and region of the speaker, and their textual …
These dialects differ according to the country and region of the speaker, and their textual …
[PDF][PDF] Text mining techniques for sentiment analysis of Arabic dialects: Literature review
Social media attracts a lot of users around the world. Many reasons drive people to use
social media sites such as expressing opinions and ideas, displaying their diaries and …
social media sites such as expressing opinions and ideas, displaying their diaries and …
OMCD: Offensive Moroccan comments dataset
Offensive content, such as verbal attacks, demeaning comments, or hate speech, has
become widespread on social media. Automatic detection of this content is considered an …
become widespread on social media. Automatic detection of this content is considered an …
Utilizing character and word embeddings for text normalization with sequence-to-sequence models
Text normalization is an important enabling technology for several NLP tasks. Recently,
neural-network-based approaches have outperformed well-established models in this task …
neural-network-based approaches have outperformed well-established models in this task …