Multitask prompted training enables zero-shot task generalization V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ... arXiv preprint arXiv:2110.08207, 2021 | 1794 | 2021 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1747 | 2023 |
Mixtral of experts AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ... arXiv preprint arXiv:2401.04088, 2024 | 1291 | 2024 |
Mistral 7B AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ... arXiv preprint arXiv:2310.06825, 2023 | 1215 | 2023 |
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... arXiv preprint arXiv:2305.06161, 2023 | 798 | 2023 |
Crosslingual generalization through multitask finetuning N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ... arXiv preprint arXiv:2211.01786, 2022 | 683 | 2022 |
OBELICS: An open web-scale filtered dataset of interleaved image-text documents H Laurençon, L Saulnier, L Tronchon, S Bekman, A Singh, A Lozhkov, ... Advances in Neural Information Processing Systems 36, 2024 | 250 | 2024 |
The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 188 | 2022 |
What language model architecture and pretraining objective works best for zero-shot generalization? T Wang, A Roberts, D Hesslow, T Le Scao, HW Chung, I Beltagy, ... International Conference on Machine Learning, 22964-22984, 2022 | 183 | 2022 |
What language model to train if you have one million GPU hours? TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... arXiv preprint arXiv:2210.15424, 2022 | 114 | 2022 |
Mistral 7B (2023) AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, ... arXiv preprint arXiv:2310.06825, 2023 | 105 | 2023 |
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ..., D Contractor, S Reddy, D Fried, D Bahdanau, Y Jernite, C Muñoz …, 2023 | 98 | 2023 |
StarCoder: may the source be with you R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ..., D Contractor, S Reddy, D Fried, D Bahdanau, Y Jernite, C Muñoz Ferrandis, S Hughes, T Wolf, A Guha, L von Werra, H de Vries 3 (3.5), 1, 2023 | 86 | 2023 |
FinGPT: Large generative models for a small language R Luukkonen, V Komulainen, J Luoma, A Eskelinen, J Kanerva, ... arXiv preprint arXiv:2311.05640, 2023 | 32 | 2023 |
Operator learning with neural fields: Tackling PDEs on general geometries L Serrano, L Le Boudec, A Kassaï Koupaï, TX Wang, Y Yin, JN Vittaut, ... Advances in Neural Information Processing Systems 36, 70581-70611, 2023 | 29 | 2023 |
Pixtral 12B P Agrawal, S Antoniak, EB Hanna, B Bout, D Chaplot, J Chudnovsky, ... arXiv preprint arXiv:2410.07073, 2024 | 24 | 2024 |
Multitask prompted training enables zero-shot task generalization V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ... arXiv preprint arXiv:2110.08207, 2021 | 18 | 2021 |
The Use of Endoscopic Ultrasound Guided Fine Needle Biopsy for the Diagnosis of Microcystic Serous Cystic Neoplasms of the Pancreas K Garg, K Boupapanh, N Zilberstein, T Wang, G Kakked, R Al-Sabti, ... | | 2024 |
Handling unstructured data for operator learning using implicit neural representations TX Wang | | 2023 |
AutoBasisEncoder: Pre-trained Neural Field Basis via Autoencoding for Operator Learning TX Wang, N Baskiotis ICLR 2024 Workshop on AI4DifferentialEquations In Science | | |