Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
J Kim, T Suzuki
International Conference on Machine Learning, 2024
Cited by: 21. Year: 2024.

Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
J Kim, K Yamamoto, K Oko, Z Yang, T Suzuki
The Twelfth International Conference on Learning Representations, 2024
Cited by: 11. Year: 2024.

Transformers are Minimax Optimal Nonparametric In-Context Learners
J Kim, T Nakamaki, T Suzuki
2024 Conference on Neural Information Processing Systems, 2024
Cited by: 7. Year: 2024.

Transformers Provably Solve Parity Efficiently with Chain of Thought
J Kim, T Suzuki
The Thirteenth International Conference on Learning Representations, 2025
Cited by: 3. Year: 2025.

Reeb flows without simple global surfaces of section
J Kim, Y Kim, O van Koert
Involve, a Journal of Mathematics 15 (5), 813-842, 2023
Cited by: 3. Year: 2023.

t³-Variational Autoencoder: Learning Heavy-tailed Data with Student's t and Power Divergence
J Kim, J Kwon, M Cho, H Lee, JH Won
The Twelfth International Conference on Learning Representations, 2024
Cited by: 1. Year: 2024.

Hessian Based Smoothing Splines for Manifold Learning
J Kim
arXiv preprint arXiv:2302.05025, 2023
Cited by: 1. Year: 2023.

Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
J Kim, D Meunier, A Gretton, T Suzuki, Z Li
The Thirteenth International Conference on Learning Representations, 2025
Year: 2025.

A Central Limit Theorem for Rosen Continued Fractions
J Kim, K Choi
arXiv preprint arXiv:2009.02047, 2020
Year: 2020.