Fine-tuning language models with just forward passes S Malladi, T Gao, E Nichani, A Damian, JD Lee, D Chen, S Arora Advances in Neural Information Processing Systems 36, 53038-53075, 2023 | 181 | 2023 |
Self-stabilization: The implicit bias of gradient descent at the edge of stability A Damian, E Nichani, JD Lee arXiv preprint arXiv:2209.15594, 2022 | 85 | 2022 |
Assessment of circulating copy number variant detection for cancer screening B Molparia, E Nichani, A Torkamani PLOS ONE 12 (7), e0180647, 2017 | 57 | 2017 |
How transformers learn causal structure with gradient descent E Nichani, A Damian, JD Lee arXiv preprint arXiv:2402.14735, 2024 | 50 | 2024 |
Smoothing the landscape boosts the signal for SGD: Optimal sample complexity for learning single index models A Damian, E Nichani, R Ge, JD Lee Advances in Neural Information Processing Systems 36, 2023 | 39 | 2023 |
Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks E Nichani, A Radhakrishnan, C Uhler arXiv preprint arXiv:2010.09610, 2020 | 24* | 2020 |
Causal structure discovery between clusters of nodes induced by latent factors C Squires, A Yun, E Nichani, R Agrawal, C Uhler Conference on Causal Learning and Reasoning, 669-687, 2022 | 20 | 2022 |
Provable guarantees for nonlinear feature learning in three-layer neural networks E Nichani, A Damian, JD Lee Advances in Neural Information Processing Systems 36, 2023 | 16 | 2023 |
Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials E Nichani, Y Bai, JD Lee Advances in Neural Information Processing Systems 35, 14568-14581, 2022 | 13 | 2022 |
Learning hierarchical polynomials with three-layer neural networks Z Wang, E Nichani, JD Lee arXiv preprint arXiv:2311.13774, 2023 | 10 | 2023 |
On Alignment in Deep Linear Neural Networks A Radhakrishnan, E Nichani, D Bernstein, C Uhler arXiv preprint arXiv:2003.06340, 2020 | 6* | 2020 |
Metastable mixing of Markov chains: Efficiently sampling low temperature exponential random graphs G Bresler, D Nagaraj, E Nichani The Annals of Applied Probability 34 (1A), 517-554, 2024 | 4 | 2024 |
An Empirical and Theoretical Analysis of the Role of Depth in Convolutional Neural Networks E Nichani Massachusetts Institute of Technology, 2021 | 1 | 2021 |
Understanding Factual Recall in Transformers via Associative Memories E Nichani, JD Lee, A Bietti arXiv preprint arXiv:2412.06538, 2024 | | 2024 |
Learning Hierarchical Polynomials of Multiple Nonlinear Features with Three-Layer Networks H Fu, Z Wang, E Nichani, JD Lee arXiv preprint arXiv:2411.17201, 2024 | | 2024 |