Natural language processing for dialects of a language: A survey
State-of-the-art natural language processing (NLP) models are trained on massive training
corpora and report superlative performance on evaluation datasets. This survey delves …
Jais and Jais-chat: Arabic-centric foundation and instruction-tuned open generative large language models
We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and
instruction-tuned open generative large language models (LLMs). The models are based on …
Having beer after prayer? Measuring cultural bias in large language models
As the reach of large language models (LMs) expands globally, their ability to cater to
diverse cultural contexts becomes crucial. Despite advancements in multilingual …
AraT5: Text-to-text transformers for Arabic language generation
Transfer learning with a unified Transformer framework (T5) that converts all language
problems into a text-to-text format was recently proposed as a simple and effective transfer …
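As a minimal sketch of the text-to-text framing this entry describes (not the authors' own code), the snippet below loads an AraT5-style checkpoint through Hugging Face transformers and generates output for a prefixed input; the model ID and the task prefix are assumptions for illustration.

```python
# Minimal sketch: text-to-text generation with an AraT5-style checkpoint.
# The checkpoint name and task prefix are assumptions, not from the paper.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "UBC-NLP/AraT5v2-base-1024"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Every task is framed as text-to-text: a plain input string (optionally
# carrying a task prefix) goes in, and generated text comes out.
text = "لخص: النص العربي هنا"  # hypothetical "summarize:" prefix + input
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```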
AraFinNLP 2024: The first Arabic financial NLP shared task
The expanding financial markets of the Arab world require sophisticated Arabic NLP tools.
To address this need within the banking domain, the Arabic Financial NLP (AraFinNLP) …
WojoodNER 2023: The First Arabic Named Entity Recognition Shared Task
We present WojoodNER-2023, the first Arabic Named Entity Recognition (NER) Shared
Task. The primary focus of WojoodNER-2023 is on Arabic NER, offering novel NER datasets …
ACOM: Arabic Comparative Opinion Mining in Social Media Utilizing Word Embedding, Deep Learning Model & LLM-GPT
Social networks have become an integral part of modern daily life. They are crowded with
vast numbers of comments, opinions, and beliefs about different …
AraBART: A pretrained Arabic sequence-to-sequence model for abstractive summarization
Like most natural language understanding and generation tasks, state-of-the-art models for
summarization are transformer-based sequence-to-sequence architectures that are …
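Since this entry frames abstractive summarization as sequence-to-sequence generation, a minimal usage sketch may help; the checkpoint name below is an assumption, and in practice a summarization-fine-tuned variant would be preferable.

```python
# Minimal sketch: abstractive summarization with an Arabic BART-style
# seq2seq model via the transformers pipeline API.
from transformers import pipeline

# Assumed checkpoint name; a summarization-fine-tuned variant is preferable.
summarizer = pipeline("summarization", model="moussaKam/AraBART")

article = "نص المقال الإخباري الكامل هنا ..."  # placeholder Arabic article text
summary = summarizer(article, max_length=60, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```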
DziriBERT: A pre-trained language model for the Algerian dialect
Pre-trained transformers are now the de facto models in Natural Language Processing given
their state-of-the-art results in many tasks and languages. However, most of the current …
A benchmark for evaluating Arabic contextualized word embedding models
Word embeddings, which represent words as numerical vectors in a high-dimensional
space, are contextualized by generating a unique vector representation for each sense of a …
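To make the notion of contextualized embeddings in this last entry concrete, the sketch below extracts token vectors from an Arabic BERT-style encoder; the checkpoint name is an assumption, and the two example sentences simply show that an ambiguous word such as "ذهب" ("went" vs. "gold") receives a different vector in each context.

```python
# Minimal sketch: contextualized word vectors from an Arabic BERT-style encoder.
# The checkpoint name is an assumption for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "aubmindlab/bert-base-arabertv2"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

def token_vectors(sentence: str) -> torch.Tensor:
    # Each token's vector is conditioned on the whole sentence, so the same
    # surface word gets different vectors in different contexts.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, dim)
    return hidden[0]

vecs_went = token_vectors("ذهب الولد إلى المدرسة")  # "the boy went to school"
vecs_gold = token_vectors("اشترى خاتماً من ذهب")     # "he bought a ring of gold"
# Comparing the vectors of "ذهب" across the two sentences illustrates the
# sense-dependent representations that such benchmarks probe.
```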