Multitask prompted training enables zero-shot task generalization V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ... arXiv preprint arXiv:2110.08207, 2021 | 1791 | 2021 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1768 | 2023 |
Mixtral of experts AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ... arXiv preprint arXiv:2401.04088, 2024 | 1413 | 2024 |
StarCoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... arXiv preprint arXiv:2305.06161, 2023 | 823 | 2023 |
Crosslingual generalization through multitask finetuning N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ... arXiv preprint arXiv:2211.01786, 2022 | 709 | 2022 |
OBELICS: An open web-scale filtered dataset of interleaved image-text documents H Laurençon, L Saulnier, L Tronchon, S Bekman, A Singh, A Lozhkov, ... Advances in Neural Information Processing Systems 36, 71683-71702, 2023 | 258 | 2023 |
Danish Contractor R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz …, 2023 | 189 | 2023 |
The BigScience ROOTS corpus: A 1.6TB composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 189 | 2022 |
What language model architecture and pretraining objective works best for zero-shot generalization? T Wang, A Roberts, D Hesslow, T Le Scao, HW Chung, I Beltagy, ... International Conference on Machine Learning, 22964-22984, 2022 | 183 | 2022 |
What language model to train if you have one million GPU hours? TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... arXiv preprint arXiv:2210.15424, 2022 | 119 | 2022 |
Mistral 7B AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ... arXiv preprint arXiv:2310.06825, 2024 | 52 | 2024 |
Pixtral 12B P Agrawal, S Antoniak, EB Hanna, B Bout, D Chaplot, J Chudnovsky, ... arXiv preprint arXiv:2410.07073, 2024 | 36 | 2024 |
FinGPT: Large generative models for a small language R Luukkonen, V Komulainen, J Luoma, A Eskelinen, J Kanerva, ... arXiv preprint arXiv:2311.05640, 2023 | 36 | 2023 |
Operator learning with neural fields: Tackling PDEs on general geometries L Serrano, L Le Boudec, A Kassaï Koupaï, TX Wang, Y Yin, JN Vittaut, ... Advances in Neural Information Processing Systems 36, 70581-70611, 2023 | 35 | 2023 |
Multitask prompted training enables zero-shot task generalization V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ... arXiv preprint arXiv:2110.08207, 2021 | 20 | 2021 |
The Use of Endoscopic Ultrasound Guided Fine Needle Biopsy for the Diagnosis of Microcystic Serous Cystic Neoplasms of the Pancreas K Garg, K Boupapanh, N Zilberstein, T Wang, G Kakked, R Al-Sabti, ... | | 2024 |
Handling unstructured data for operator learning using implicit neural representations TX Wang | | 2023 |
AutoBasisEncoder: Pre-trained Neural Field Basis via Autoencoding for Operator Learning TX Wang, N Baskiotis ICLR 2024 Workshop on AI4DifferentialEquations In Science | | |