PaLM: Scaling language modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023 | 5581 | 2023 |
PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 1558 | 2023 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1338 | 2022 |
Black holes and random matrices JS Cotler, G Gur-Ari, M Hanada, J Polchinski, P Saad, SH Shenker, ... Journal of High Energy Physics 2017 (5), 1-54, 2017 | 989 | 2017 |
Solving quantitative reasoning problems with language models A Lewkowycz, A Andreassen, D Dohan, E Dyer, H Michalewski, ... Advances in Neural Information Processing Systems 35, 3843-3857, 2022 | 711 | 2022 |
Show your work: Scratchpads for intermediate computation with language models M Nye, AJ Andreassen, G Gur-Ari, H Michalewski, J Austin, D Bieber, ... | 644 | 2021 |
D=3 bosonic vector models coupled to Chern-Simons gauge theories O Aharony, G Gur-Ari, R Yacoby Journal of High Energy Physics 2012 (3), 1-25, 2012 | 349 | 2012 |
Correlation functions of large N Chern-Simons-matter theories and bosonization in three dimensions O Aharony, G Gur-Ari, R Yacoby Journal of High Energy Physics 2012 (12), 1-37, 2012 | 276 | 2012 |
The large learning rate phase of deep learning: the catapult mechanism A Lewkowycz, Y Bahri, E Dyer, J Sohl-Dickstein, G Gur-Ari arXiv preprint arXiv:2003.02218, 2020 | 244 | 2020 |
Gradient descent happens in a tiny subspace G Gur-Ari, DA Roberts, E Dyer arXiv preprint arXiv:1812.04754, 2018 | 224 | 2018 |
Exploring length generalization in large language models C Anil, Y Wu, A Andreassen, A Lewkowycz, V Misra, V Ramasesh, ... Advances in Neural Information Processing Systems 35, 38546-38556, 2022 | 206 | 2022 |
The thermal free energy in large N Chern-Simons-matter theories O Aharony, S Giombi, G Gur-Ari, J Maldacena, R Yacoby Journal of High Energy Physics 2013 (3), 1-38, 2013 | 162 | 2013 |
PaLM: Scaling language modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311, 2022 | 133 | 2022 |
Asymptotics of wide networks from Feynman diagrams E Dyer, G Gur-Ari arXiv preprint arXiv:1909.11304, 2019 | 132 | 2019 |
Correlators of large N fermionic Chern-Simons vector models G Gur-Ari, R Yacoby Journal of High Energy Physics 2013 (2), 1-17, 2013 | 128 | 2013 |
PaLM: Scaling language modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311, 2022 | 123 | 2022 |
Three dimensional bosonization from supersymmetry G Gur-Ari, R Yacoby Journal of High Energy Physics 2015 (11), 1-32, 2015 | 102 | 2015 |
Chaos in classical D0-brane mechanics G Gur-Ari, M Hanada, SH Shenker Journal of High Energy Physics 2016 (2), 1-31, 2016 | 93 | 2016 |
PaLM 2 technical report R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ... arXiv preprint arXiv:2305.10403, 2023 | 83 | 2023 |
2D CFT partition functions at late times E Dyer, G Gur-Ari Journal of High Energy Physics 2017 (8), 1-35, 2017 | 80 | 2017 |