BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1790 | 2023 |
The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only G Penedo, Q Malartic, D Hesslow, R Cojocaru, H Alobeidli, A Cappelli, ... Advances in Neural Information Processing Systems 36, 79155-79172, 2023 | 852* | 2023 |
The Falcon series of open language models E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ... arXiv preprint arXiv:2311.16867, 2023 | 455 | 2023 |
Falcon-40B: an open large language model with state-of-the-art performance E Almazrouei, H Alobeidli, A Alshamsi, A Cappelli, R Cojocaru, M Debbah, ... URL https://falconllm.tii.ae, 2023 | 261* | 2023 |
What language model architecture and pretraining objective works best for zero-shot generalization? T Wang, A Roberts, D Hesslow, T Le Scao, HW Chung, I Beltagy, ... International Conference on Machine Learning, 22964-22984, 2022 | 185 | 2022 |
What language model to train if you have one million GPU hours? T Le Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ... arXiv preprint arXiv:2210.15424, 2022 | 117 | 2022 |
RITA: a study on scaling up generative protein sequence models D Hesslow, N Zanichelli, P Notin, I Poli, D Marks arXiv preprint arXiv:2205.05789, 2022 | 86 | 2022 |
BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... CoRR abs/2211.05100, arXiv preprint arXiv:2211.05100, 2022 | 24 | |
LightOn optical processing unit: Scaling-up AI and HPC with a non von Neumann co-processor C Brossollet, A Cappelli, I Carron, C Chaintoutis, A Chatelain, L Daudet, ... arXiv preprint arXiv:2107.11814, 2021 | 11 | 2021 |
Photonic co-processors in HPC: using LightOn OPUs for randomized numerical linear algebra D Hesslow, A Cappelli, I Carron, L Daudet, R Lafargue, K Müller, ... arXiv preprint arXiv:2104.14429, 2021 | 10 | 2021 |
Is the number of trainable parameters all that actually matters? A Chatelain, A Djeghri, D Hesslow, J Launay I (Still) Can't Believe It's Not Better! Workshop at NeurIPS 2021, 27-32, 2022 | 7 | 2022 |
Contrastive embeddings for neural architectures D Hesslow, I Poli arXiv preprint arXiv:2102.04208, 2021 | 7 | 2021 |
Building a Swedish question-answering model H von Essen, D Hesslow Proceedings of the Probability and Meaning Conference (PaM 2020), 117-127, 2020 | 6 | 2020 |
Linear optical random projections without holography R Ohana, D Hesslow, D Brunner, S Gigan, K Müller Optics Express 31 (16), 25881-25888, 2023 | 3 | 2023 |
Scaling Laws Beyond Backpropagation MJ Filipovich, A Cappelli, D Hesslow, J Launay arXiv preprint arXiv:2210.14593, 2022 | 3 | 2022 |
Artificial Neural Network Training on an Optical Processor via Direct Feedback Alignment K Müller, J Launay, I Poli, M Filipovich, A Cappelli, D Hesslow, I Carron, ... The European Conference on Lasers and Electro-Optics, jsiii_3_3, 2023 | 1 | 2023 |
Method and system for machine learning using optical data I Poli, J Launay, K Müller, G Pariente, I Carron, L Daudet, R Ohana, ... US Patent 11,574,178, 2023 | | 2023 |
Real-Time Global Illumination in Web-Browsers M Bertilsson, D Hesslow, N Jonsson, S Moos, O Persson, H von Essen | | 2018 |