Title | Authors | Venue | Cited by | Year |
Studying Large Language Model Generalization with Influence Functions | R Grosse*, J Bae*, C Anil*, N Elhage, A Tamkin, A Tajdini, B Steiner, D Li, ... | arXiv preprint arXiv:2308.03296, 2023 | 132 | 2023 |
If Influence Functions are the Answer, Then What is the Question? | J Bae, N Ng, A Lo, M Ghassemi, RB Grosse | Advances in Neural Information Processing Systems 35, 17953-17967, 2022 | 103 | 2022 |
Using Large Language Models for Hyperparameter Optimization | MR Zhang, N Desai, J Bae, J Lorraine, J Ba | Neural Information Processing Systems (Foundation Models for Decision Making …, 2023 | 36 | 2023 |
Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians | J Bae, RB Grosse | Advances in Neural Information Processing Systems 33, 21725-21737, 2020 | 33 | 2020 |
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes | J Lucas, J Bae, MR Zhang, S Fort, R Zemel, R Grosse | Neural Information Processing Systems (Optimization for Machine Learning …, 2021 | 32* | 2021 |
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions | SK Choe, H Ahn*, J Bae*, K Zhao*, M Kang, Y Chung, A Pratapa, ... | arXiv preprint arXiv:2405.13954, 2024 | 24 | 2024 |
Benchmarking Neural Network Training Algorithms | GE Dahl, F Schneider, Z Nado, N Agarwal, CS Sastry, P Hennig, ... | arXiv preprint arXiv:2306.07179, 2023 | 24 | 2023 |
Eigenvalue Corrected Noisy Natural Gradient | J Bae, G Zhang, R Grosse | Neural Information Processing Systems (Bayesian Deep Learning Workshop), 2018 | 24 | 2018 |
Training Data Attribution via Approximate Unrolling | J Bae, W Lin, J Lorraine, RB Grosse | The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024 | 17* | 2024 |
Amortized Proximal Optimization | J Bae*, P Vicol*, JZ HaoChen, RB Grosse | Advances in Neural Information Processing Systems 35, 8982-8997, 2022 | 17 | 2022 |
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve | J Bae, MR Zhang, M Ruan, E Wang, S Hasegawa, J Ba, R Grosse | International Conference on Learning Representations 11, 2022 | 16 | 2022 |
Learnable Pooling Methods for Video Classification | S Kmiec, J Bae, R An | Proceedings of the European Conference on Computer Vision (ECCV), 2018 | 15 | 2018 |
On Monotonic Linear Interpolation of Neural Network Parameters | JR Lucas, J Bae, MR Zhang, S Fort, R Zemel, RB Grosse | International Conference on Machine Learning, 7168-7179, 2021 | 12 | 2021 |
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective | W Lin, F Dangel, R Eschenhagen, J Bae, RE Turner, A Makhzani | arXiv preprint arXiv:2402.03496, 2024 | 6 | 2024 |
Efficient Parametric Approximations of Neural Network Function Space Distance | N Dhawan, S Huang, J Bae, RB Grosse | International Conference on Machine Learning, 7795-7812, 2023 | 5 | 2023 |
CSC 311: Introduction to Machine Learning | R Grosse, C Maddison, J Bae, S Pitis | University of Toronto, Fall, 2020 | 4 | 2020 |
Procedural knowledge in pretraining drives reasoning in large language models | L Ruis, M Mozes, J Bae, SR Kamalakara, D Talupuru, A Locatelli, R Kirk, ... | arXiv preprint arXiv:2411.12580, 2024 | 2 | 2024 |
Fast 6DoF Pose Estimation with Synthetic Textureless CAD Model for Mobile Applications | B Chen, J Bae, D Mukherjee | 2019 IEEE International Conference on Image Processing (ICIP), 2541-2545, 2019 | 2 | 2019 |
Influence Functions for Scalable Data Attribution in Diffusion Models | B Mlodozeniec, R Eschenhagen, J Bae, A Immer, D Krueger, R Turner | arXiv preprint arXiv:2410.13850, 2024 | 1 | 2024 |
Training Data Attribution (TDA): Examining Its Adoption & Use Cases | D Cheng, J Bae, J Bullock, D Kristofferson | arXiv preprint arXiv:2501.12642, 2025 | | 2025 |