Sparks of artificial general intelligence: Early experiments with gpt-4 S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ... ArXiv, 2023 | 3788 | 2023 |
The power of depth for feedforward neural networks R Eldan, O Shamir Conference on learning theory, 907-940, 2016 | 1046 | 2016 |
Phi-3 technical report: A highly capable language model locally on your phone M Abdin, J Aneja, H Awadalla, A Awadallah, AA Awan, N Bach, A Bahree, ... arXiv preprint arXiv:2404.14219, 2024 | 861 | 2024 |
Textbooks are all you need S Gunasekar, Y Zhang, J Aneja, CCT Mendes, A Del Giorno, S Gopi, ... arXiv preprint arXiv:2306.11644, 2023 | 568 | 2023 |
Textbooks are all you need ii: phi-1.5 technical report Y Li, S Bubeck, R Eldan, A Del Giorno, S Gunasekar, YT Lee arXiv preprint arXiv:2309.05463, 2023 | 417 | 2023 |
Phi-2: The surprising power of small language models M Javaheripi, S Bubeck, M Abdin, J Aneja, CCT Mendes, ... Microsoft Research Blog 1 (3), 3, 2023 | 213 | 2023 |
Tinystories: How small can language models be and still speak coherent english? R Eldan, Y Li arXiv preprint arXiv:2305.07759, 2023 | 205 | 2023 |
Kernel-based methods for bandit convex optimization S Bubeck, R Eldan, YT Lee Journal of the ACM (JACM) 68 (4), 1-35, 2021 | 189 | 2021 |
Testing for high‐dimensional geometry in random graphs S Bubeck, J Ding, R Eldan, MZ Rácz Random Structures & Algorithms 49 (3), 503-532, 2016 | 170 | 2016 |
Thin shell implies spectral gap up to polylog via a stochastic localization scheme R Eldan Geometric and Functional Analysis 23 (2), 532-569, 2013 | 169 | 2013 |
Sampling from a log-concave distribution with projected Langevin Monte Carlo S Bubeck, R Eldan, J Lehec Discrete & Computational Geometry 59, 757-783, 2018 | 166 | 2018 |
Who's harry potter? approximate unlearning in llms R Eldan, M Russinovich arXiv preprint arXiv:2310.02238, 2023 | 154 | 2023 |
Sparks of artificial general intelligence: Early experiments with gpt-4 S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ..., Y Zhang arXiv preprint arXiv:2303.12712 10, 2023 | 130 | 2023 |
Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations R Eldan Geometric and Functional Analysis 28 (6), 1548-1596, 2018 | 100 | 2018 |
Localization schemes: A framework for proving mixing bounds for Markov chains Y Chen, R Eldan 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS …, 2022 | 94 | 2022 |
Multi-scale exploration of convex functions and bandit convex optimization S Bubeck, R Eldan Conference on Learning Theory, 583-589, 2016 | 88 | 2016 |
Sparks of artificial general intelligence: Early experiments with gpt-4 S Bubeck, V Chandrasekaran, R Eldan, J Gehrke, E Horvitz, E Kamar, ... arXiv preprint arXiv:2303.12712 10, 2024 | 87 | 2024 |
A two-sided estimate for the Gaussian noise stability deficit R Eldan Inventiones mathematicae 201, 561-624, 2015 | 87 | 2015 |
A spectral condition for spectral gap: fast mixing in high-temperature Ising models R Eldan, F Koehler, O Zeitouni Probability theory and related fields 182 (3), 1035-1051, 2022 | 74 | 2022 |
Unveiling transformers with lego: a synthetic reasoning task Y Zhang, A Backurs, S Bubeck, R Eldan, S Gunasekar, T Wagner arXiv preprint arXiv:2206.04301, 2022 | 72 | 2022 |