Sabiá: Portuguese large language models

R Pires, H Abonizio, TS Almeida… - Brazilian Conference on …, 2023 - Springer
As the capabilities of language models continue to advance, it is conceivable that a "one-size-fits-all" model will remain the main paradigm. For instance, given the vast number of …

Prometheus 2: An open source language model specialized in evaluating other language models

S Kim, J Suk, S Longpre, BY Lin, J Shin… - arXiv preprint arXiv …, 2024 - arxiv.org
Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from
various LMs. However, concerns including transparency, controllability, and affordability …

Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration

S Feng, W Shi, Y Wang, W Ding… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps (missing or outdated information in LLMs) might always persist given the evolving nature of …

MoDE: CLIP Data Experts via Clustering

J Ma, PY Huang, S Xie, SW Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
The success of contrastive language-image pretraining (CLIP) relies on supervision from the pairing between images and captions, which tends to be noisy in web-crawled data. We …

QuRating: Selecting high-quality data for training language models

A Wettig, A Gupta, S Malik, D Chen - arXiv preprint arXiv:2402.09739, 2024 - arxiv.org
Selecting high-quality pre-training data is important for creating capable language models,
but existing methods rely on simple heuristics. We introduce QuRating, a method for …

AboutMe: Using self-descriptions in webpages to document the effects of English pretraining data filters

L Lucy, S Gururangan, L Soldaini, E Strubell… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models' (LLMs) abilities are drawn from their pretraining data, and model development begins with data curation. However, decisions around what data is retained or …

Tiny Models are the Computational Saver for Large Models

Q Wang, B Cardiff, A Frappé, B Larras… - European Conference on …, 2024 - Springer
This paper introduces TinySaver, an early-exit-like dynamic model compression approach which employs tiny models to substitute for large models adaptively. Distinct from traditional …

An introduction to vision-language modeling

F Bordes, RY Pang, A Ajay, AC Li, A Bardes… - arXiv preprint arXiv …, 2024 - arxiv.org
Following the recent popularity of Large Language Models (LLMs), several attempts have
been made to extend them to the visual domain. From having a visual assistant that could …

Pedagogical Alignment of Large Language Models (LLM) for Personalized Learning: A Survey, Trends and Challenges

MA Razafinirina, WG Dimbisoa, T Mahatody - Journal of Intelligent …, 2024 - scirp.org
This survey paper investigates how personalized learning offered by Large Language
Models (LLMs) could transform educational experiences. We explore Knowledge Editing …

LEMoE: Advanced mixture of experts adaptor for lifelong model editing of large language models

R Wang, P Li - arXiv preprint arXiv:2406.20030, 2024 - arxiv.org
Large language models (LLMs) require continual knowledge updates to stay abreast of ever-changing world facts, prompting the formulation of the lifelong model editing task. While …