BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1758 | 2023 |
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... arXiv preprint arXiv:2305.06161, 2023 | 834 | 2023 |
The Stack: 3 TB of permissively licensed source code D Kocetkov, R Li, LB Allal, J Li, C Mou, CM Ferrandis, Y Jernite, M Mitchell, ... arXiv preprint arXiv:2211.15533, 2022 | 284 | 2022 |
SantaCoder: don't reach for the stars! LB Allal, R Li, D Kocetkov, C Mou, C Akiki, CM Ferrandis, N Muennighoff, ... arXiv preprint arXiv:2301.03988, 2023 | 243 | 2023 |
StarCoder 2 and The Stack v2: The next generation A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ... arXiv preprint arXiv:2402.19173, 2024 | 209 | 2024 |
The BigScience ROOTS corpus: A 1.6 TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 189 | 2022 |
The FineWeb datasets: Decanting the web for the finest text data at scale G Penedo, H Kydlíček, A Lozhkov, M Mitchell, C Raffel, L Von Werra, ... The Thirty-eighth Conference on Neural Information Processing Systems …, 2024 | 61 | 2024 |
A framework for the evaluation of code generation models LB Allal, N Muennighoff, LK Umapathi, B Lipkin, L Von Werra A framework for the evaluation of code generation models, 2022 | 54 | 2022 |
Scaling laws and compute-optimal training beyond fixed training durations A Hägele, E Bakouch, A Kosson, LB Allal, L Von Werra, M Jaggi arXiv preprint arXiv:2405.18392, 2024 | 23 | 2024 |
Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean M R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, and Harm de Vries …, 2023 | 15 | 2023 |
SmolLM2: When Smol Goes Big--Data-Centric Training of a Small Language Model LB Allal, A Lozhkov, E Bakouch, GM Blázquez, G Penedo, L Tunstall, ... arXiv preprint arXiv:2502.02737, 2025 | 3 | 2025 |
The BigCode project governance card S Hughes, H de Vries, J Robinson, CM Ferrandis, LB Allal, L von Werra, ... arXiv preprint arXiv:2312.03872, 2023 | 2 | 2023 |
Hawkes point processes based inference applied to seismic data analysis LB Allal, A Lejay, RS Stoica 2020 RING MEETING, 2020 | 1 | 2020 |