Thomas Wang
Research Engineer, Mistral.ai
Verified email at mistral.ai
Title
Cited by
Year
Multitask prompted training enables zero-shot task generalization
V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ...
arXiv preprint arXiv:2110.08207, 2021
Cited by: 1791 · Year: 2021
Bloom: A 176b-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by: 1768 · Year: 2023
Mixtral of experts
AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ...
arXiv preprint arXiv:2401.04088, 2024
Cited by: 1413 · Year: 2024
Starcoder: may the source be with you!
R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...
arXiv preprint arXiv:2305.06161, 2023
Cited by: 823 · Year: 2023
Crosslingual generalization through multitask finetuning
N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ...
arXiv preprint arXiv:2211.01786, 2022
Cited by: 709 · Year: 2022
Obelics: An open web-scale filtered dataset of interleaved image-text documents
H Laurençon, L Saulnier, L Tronchon, S Bekman, A Singh, A Lozhkov, ...
Advances in Neural Information Processing Systems 36, 71683-71702, 2023
Cited by: 258 · Year: 2023
Danish Contractor
R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...
Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz …, 2023
Cited by: 189 · Year: 2023
The bigscience roots corpus: A 1.6 tb composite multilingual dataset
H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ...
Advances in Neural Information Processing Systems 35, 31809-31826, 2022
Cited by: 189 · Year: 2022
What language model architecture and pretraining objective works best for zero-shot generalization?
T Wang, A Roberts, D Hesslow, T Le Scao, HW Chung, I Beltagy, ...
International Conference on Machine Learning, 22964-22984, 2022
Cited by: 183 · Year: 2022
What language model to train if you have one million GPU hours?
TL Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ...
arXiv preprint arXiv:2210.15424, 2022
Cited by: 119 · Year: 2022
Mistral 7B. arXiv 2023
AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ...
arXiv preprint arXiv:2310.06825, 2024
Cited by: 52 · Year: 2024
Pixtral 12B
P Agrawal, S Antoniak, EB Hanna, B Bout, D Chaplot, J Chudnovsky, ...
arXiv preprint arXiv:2410.07073, 2024
Cited by: 36 · Year: 2024
FinGPT: Large generative models for a small language
R Luukkonen, V Komulainen, J Luoma, A Eskelinen, J Kanerva, ...
arXiv preprint arXiv:2311.05640, 2023
Cited by: 36 · Year: 2023
Operator learning with neural fields: Tackling pdes on general geometries
L Serrano, L Le Boudec, A Kassaï Koupaï, TX Wang, Y Yin, JN Vittaut, ...
Advances in Neural Information Processing Systems 36, 70581-70611, 2023
Cited by: 35 · Year: 2023
Multitask prompted training enables zero-shot task generalization. arXiv
V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ...
arXiv preprint arXiv:2110.08207, 2021
Cited by: 20 · Year: 2021
The Use of Endoscopic Ultrasound Guided Fine Needle Biopsy for the Diagnosis of Microcystic Serous Cystic Neoplasms of the Pancreas
K Garg, K Boupapanh, N Zilberstein, T Wang, G Kakked, R Al-Sabti, ...
Year: 2024
Handling unstructured data for operator learning using implicit neural representations
TX Wang
Year: 2023
AutoBasisEncoder: Pre-trained Neural Field Basis via Autoencoding for Operator Learning
TX Wang, N Baskiotis
ICLR 2024 Workshop on AI4DifferentialEquations In Science, 0
Articles 1–18