- Academic Search

Bridging the Data Provenance Gap Across Text, Speech and Video

S Longpre, N Singh, M Cherep, K Tiwary… - arXiv preprint arXiv …, 2024 - arxiv.org

Progress in AI is driven largely by the scale and quality of training data. Despite this, there is
a deficit of empirical analysis examining the attributes of well-established datasets beyond …

BenCzechMark: A Czech-centric Multitask and Multimetric Benchmark for Large Language Models with Duel Scoring Mechanism

M Fajcik, M Docekal, J Dolezal, K Ondrej… - arXiv preprint arXiv …, 2024 - arxiv.org

We present BenCzechMark (BCM), the first comprehensive Czech language benchmark
designed for large language models, offering diverse tasks, multiple task formats, and …

Bridging the Gap: Enhancing LLM Performance for Low-Resource African Languages with New Benchmarks, Fine-Tuning, and Cultural Adjustments

T Alhanai, A Kasumovic, M Ghassemi… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) have shown remarkable performance across various tasks,
yet significant disparities remain for non-English languages, and especially native African …

IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models

Bridging the Data Provenance Gap Across Text, Speech and Video