Small language models: Survey, measurements, and insights

Z Lu, X Li, D Cai, R Yi, F Liu, X Zhang, ND Lane… - arXiv preprint arXiv …, 2024 - arxiv.org
Small language models (SLMs), despite their widespread adoption in modern smart
devices, have received significantly less academic attention compared to their large …

EMMA-500: Enhancing massively multilingual adaptation of large language models

S Ji, Z Li, I Paul, J Paavola, P Lin, P Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce EMMA-500, a large-scale multilingual language model
continue-trained on texts across 546 languages designed for enhanced multilingual performance …

FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in LLMs

Y Li, S Sun, P Liu - Proceedings of the 2024 Conference on …, 2024 - aclanthology.org
Fuzzy reasoning is vital due to the frequent use of imprecise information in daily contexts.
However, the ability of current large language models (LLMs) to handle such reasoning …

CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning

H Yu, X Wu, W Yin, D Zhang, S Hu - arXiv preprint arXiv:2410.02229, 2024 - arxiv.org
Large language models (LLMs) have made significant progress in natural language
understanding and generation, driven by scalable pretraining and advanced finetuning …

Revealing the Decisive Effect of Instruction Diversity on Generalization

D Zhang, J Wang, F Charton - arXiv preprint arXiv:2410.04717, 2024 - arxiv.org
Understanding and accurately following instructions is critical for large language models
(LLMs) to be effective across diverse tasks. In this work, we rigorously examine the key …

Thinking with Knowledge Graphs: Enhancing LLM Reasoning Through Structured Data

X Wu, K Tsioutsiouliklis - arXiv preprint arXiv:2412.10654, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural
language understanding and generation. However, they often struggle with complex …