Global MMLU: Understanding and addressing cultural and linguistic biases in multilingual evaluation
Cultural biases in multilingual datasets pose significant challenges for their effectiveness as
global benchmarks. These biases stem not only from language but also from the cultural …
INCLUDE: Evaluating multilingual language understanding with regional knowledge
The performance differential of large language models (LLMs) between languages hinders
their effective deployment in many regions, inhibiting the potential economic and societal …
MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
English-centric large language models (LLMs) often show strong multilingual capabilities.
However, the multilingual performance of these models remains unclear and is not …
AfriInstruct: Instruction Tuning of African Languages for Diverse Tasks
K Uemura, M Chen, A Pejovic… - Findings of the …, 2024 - aclanthology.org
Large language models (LLMs) for African languages perform worse compared to their
performance in high-resource languages. To address this issue, we introduce AfriInstruct …
The Roles of English in Evaluating Multilingual Language Models
Multilingual natural language processing is getting increased attention, with numerous
models, benchmarks, and methods being released for many languages. English is often …
IberoBench: A Benchmark for LLM Evaluation in Iberian Languages
I Baucells, J Aula-Blasco, I de-Dios-Flores… - Proceedings of the …, 2025 - aclanthology.org
The current best practice to measure the performance of base Large Language Models is to
establish a multi-task benchmark that covers a range of capabilities of interest. Currently …
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages
Evaluations of Large Language Models (LLMs) on knowledge-intensive tasks and factual
accuracy often focus on high-resource languages primarily because datasets for low …
Automatically Generating IsiZulu Words From Indo-Arabic Numerals
Artificial conversational agents are deployed to assist humans in a variety of tasks. Some of
these tasks require the capability to communicate numbers as part of their internal and …
Large Language Models Compression via Low-Rank Feature Distillation
Current LLM structured pruning methods involve two steps: (1) compressing with calibration
data and (2) continued pretraining on billions of tokens to recover the lost performance. This …
[PDF][PDF] Scaling Pre-training Data and Language Models for African Languages
A Oladipo - 2024 - uwspace.uwaterloo.ca
Recent advancements in language models, particularly for high-resource languages, have
not been paralleled in low-resource languages spoken across Africa. This thesis addresses …