BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1782 | 2023 |
MTEB: Massive text embedding benchmark N Muennighoff, N Tazi, L Magne, N Reimers EACL 2023, 2022 | 660 | 2022 |
Scaling Data-Constrained Language Models N Muennighoff, AM Rush, B Barak, TL Scao, A Piktus, N Tazi, S Pyysalo, ... NeurIPS 2023, 2023 | 243 | 2023 |
StarCoder 2 and The Stack v2: The next generation A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ... arXiv preprint arXiv:2402.19173, 2024 | 213 | 2024 |
FinGPT: Large generative models for a small language R Luukkonen, V Komulainen, J Luoma, A Eskelinen, J Kanerva, ... arXiv preprint arXiv:2311.05640, 2023 | 36 | 2023 |
Masader Plus: A new interface for exploring +500 Arabic NLP datasets Y Altaher, A Fadel, M Alotaibi, M Alyazidi, M Al-Mutairi, M Aldhbuiub, ... arXiv preprint arXiv:2208.00932, 2022 | 2 | 2022 |
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model V Danchev, V Nikoulina, V Laippala, V Lepercq, V Prabhu, Z Alyafeai, ... | | 2023 |
No Village Left Behind: A Moroccan Data-driven Platform for Effective Aid Coordination A Bounhar, A Anouzla, A Lekssays, A Zizaan, B Chourane, ... | | |