Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1300 | 2022 |
Simplified Josephson-junction fabrication process for reproducibly high-performance superconducting qubits A Osman, J Simon, A Bengtsson, S Kosen, P Krantz, D P Lozano, ... Applied Physics Letters 118 (6), 2021 | 89 | 2021 |
Benign, tempered, or catastrophic: Toward a refined taxonomy of overfitting N Mallinar, J Simon, A Abedsoltan, P Pandit, M Belkin, P Nakkiran Advances in Neural Information Processing Systems 35, 1182-1195, 2022 | 74 | 2022 |
The eigenlearning framework: A conservation law perspective on kernel regression and wide neural networks JB Simon, M Dickens, D Karkada, MR DeWeese arXiv preprint arXiv:2110.03922, 2021 | 55* | 2021 |
On the stepwise nature of self-supervised learning JB Simon, M Knutins, L Ziyin, D Geisz, AJ Fetterman, J Albrecht International Conference on Machine Learning, 31852-31876, 2023 | 32 | 2023 |
A spectral condition for feature learning G Yang, JB Simon, J Bernstein arXiv preprint arXiv:2310.17813, 2023 | 22 | 2023 |
Avalon: A benchmark for RL generalization using procedurally generated worlds J Albrecht, A Fetterman, B Fogelman, E Kitanidis, B Wróblewski, N Seo, ... Advances in Neural Information Processing Systems 35, 12813-12825, 2022 | 22 | 2022 |
More is better in modern machine learning: when infinite overparameterization is optimal and overfitting is obligatory JB Simon, D Karkada, N Ghosh, M Belkin arXiv preprint arXiv:2311.14646, 2023 | 18 | 2023 |
SGD with a constant large learning rate can converge to local maxima L Ziyin, B Li, JB Simon, M Ueda arXiv preprint arXiv:2107.11774, 2021 | 16* | 2021 |
Reverse engineering the neural tangent kernel JB Simon, S Anand, M DeWeese International Conference on Machine Learning, 20215-20231, 2022 | 13 | 2022 |
Interleaved electro-optic dual comb generation to expand bandwidth and scan rate for molecular spectroscopy and dynamics studies near 1.6 µm JR Stroud, JB Simon, GA Wagner, DF Plusquellic Optics Express 29 (21), 33155-33170, 2021 | 11 | 2021 |
Critical point-finding methods reveal gradient-flat regions of deep network losses CG Frye, J Simon, NS Wadia, A Ligeralde, MR DeWeese, KE Bouchard Neural Computation 33 (6), 1469-1497, 2021 | 9 | 2021 |
An agnostic view on the cost of overfitting in (kernel) ridge regression L Zhou, JB Simon, G Vardi, N Srebro arXiv preprint arXiv:2306.13185, 2023 | 7 | 2023 |
Tune as you scale: Hyperparameter optimization for compute efficient training AJ Fetterman, E Kitanidis, J Albrecht, Z Polizzi, B Fogelman, M Knutins, ... arXiv preprint arXiv:2306.08055, 2023 | 7 | 2023 |
More is Better: when Infinite Overparameterization is Optimal and Overfitting is Obligatory JB Simon, D Karkada, N Ghosh, M Belkin The Twelfth International Conference on Learning Representations, 2024 | 6 | 2024 |
On Kernel Regression with Data-Dependent Kernels JB Simon arXiv preprint arXiv:2209.01691, 2022 | 3 | 2022 |
Rapid Passage Signals from CO2 at 1.6 µm Using a Dual Chirped-Pulse Electro-Optic Comb System with High-Order Interleaving JR Stroud, J Simon, GA Wagner, DF Plusquellic CLEO: Science and Innovations, SM3A.1, 2021 | 2 | 2021 |
Fast noise-resistant control of donor nuclear spin qubits in silicon J Simon, FA Calderon-Vargas, E Barnes, SE Economou Physical Review B 101 (20), 205307, 2020 | 2 | 2020 |
The Optimization Landscape of SGD Across the Feature Learning Strength A Atanasov, A Meterez, JB Simon, C Pehlevan arXiv preprint arXiv:2410.04642, 2024 | 1 | 2024 |
Les Houches lectures on deep learning at large and infinite width Y Bahri, B Hanin, A Brossollet, V Erba, C Keup, R Pacelli, JB Simon Journal of Statistical Mechanics: Theory and Experiment 2024 (10), 104012, 2024 | | 2024 |