Aya model: An instruction finetuned open-access multilingual language model

A Üstün, V Aryabumi, ZX Yong, WY Ko… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have centered around a handful of
data-rich languages. What does it take to broaden access to breakthroughs beyond first …

Aya dataset: An open-access collection for multilingual instruction tuning

S Singh, F Vargus, D Dsouza, BF Karlsson… - arXiv preprint arXiv …, 2024 - arxiv.org
Datasets are foundational to many breakthroughs in modern artificial intelligence. Many
recent achievements in the space of natural language processing (NLP) can be attributed to …

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models are capable of using powerful Large Language
Models to handle and respond to queries in multiple languages, which achieves remarkable …

CVQA: Culturally-diverse multilingual visual question answering benchmark

D Romero, C Lyu, HA Wibowo, T Lynn, I Hamed… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used
to test the ability of vision-language models to understand and reason on knowledge …

Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review

Z Abidin, A Junaidi - Journal of Information Systems …, 2024 - e-journal.unair.ac.id
Background: Stemming is significantly essential in natural language processing (NLP) due
to the ability to minimize word variations to fundamental forms. This procedure facilitates the …

Preference tuning with human feedback on language, speech, and vision tasks: A survey

GI Winata, H Zhao, A Das, W Tang, DD Yao… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …

A survey of multilingual large language models

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao, M Li… - Patterns, 2025 - cell.com
Multilingual large language models (MLLMs) leverage advanced large language models to
process and respond to queries across multiple languages, achieving significant success in …

Global MMLU: Understanding and addressing cultural and linguistic biases in multilingual evaluation

S Singh, A Romanou, C Fourrier, DI Adelani… - arXiv preprint arXiv …, 2024 - arxiv.org
Cultural biases in multilingual datasets pose significant challenges for their effectiveness as
global benchmarks. These biases stem not only from language but also from the cultural …

INCLUDE: Evaluating multilingual language understanding with regional knowledge

A Romanou, N Foroutan, A Sotnikova, Z Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The performance differential of large language models (LLM) between languages hinders
their effective deployment in many regions, inhibiting the potential economic and societal …

Extraction and attribution of public figures statements for journalism in Indonesia using deep learning

YSP WP, YJ Kumar, NZ Zulkarnain, B Raza - Knowledge-Based Systems, 2024 - Elsevier
News articles are usually written by journalists based on statements taken from interviews
with public figures. Attribution from such statements provides important information and it …