A fast optimization view: Reformulating single layer attention in llm based on tensor and svm trick, and solving it in matrix multiplication time Y Gao, Z Song, W Wang, J Yin arXiv preprint arXiv:2309.07418, 2023 | 41 | 2023 |
An iterative algorithm for rescaled hyperbolic functions regression Y Gao, Z Song, J Yin The 28th International Conference on Artificial Intelligence and Statistics …, 2025 | 34 | 2025 |
Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers Y Liang, H Liu, Z Shi, Z Song, Z Xu, J Yin arXiv preprint arXiv:2405.05219, 2024 | 30 | 2024 |
Low rank matrix completion via robust alternating minimization in nearly linear time Y Gu, Z Song, J Yin, L Zhang The Twelfth International Conference on Learning Representations (ICLR 2024), 2024 | 26 | 2024 |
Gradientcoin: A peer-to-peer decentralized large language models Y Gao, Z Song, J Yin arXiv preprint arXiv:2308.10502, 2023 | 26 | 2023 |
Solving attention kernel regression problem via pre-conditioner Z Song, J Yin, L Zhang The 27th International Conference on Artificial Intelligence and Statistics …, 2024 | 18 | 2024 |
A unified scheme of resnet and softmax Z Song, W Wang, J Yin arXiv preprint arXiv:2309.13482, 2023 | 13 | 2023 |
Efficient alternating minimization with applications to weighted low rank approximation Z Song, M Ye, J Yin, L Zhang The Thirteenth International Conference on Learning Representations (ICLR 2025), 2025 | 9 | 2025 |
Local convergence of approximate newton method for two layer nonlinear regression Z Li, Z Song, Z Wang, J Yin arXiv preprint arXiv:2311.15390, 2023 | 8 | 2023 |
Dynamical fractal: Theory and case study J Yin Chaos, Solitons & Fractals 176, 114190, 2023 | 8 | 2023 |
A Nearly-Optimal Bound for Fast Regression with ℓ∞ Guarantee Z Song, M Ye, J Yin, L Zhang International Conference on Machine Learning, 32463-32482, 2023 | 8 | 2023 |
Federated empirical risk minimization via second-order method S Bian, Z Song, J Yin arXiv preprint arXiv:2305.17482, 2023 | 8 | 2023 |
The expressibility of polynomial based attention scheme Z Song, G Xu, J Yin arXiv preprint arXiv:2310.20051, 2023 | 5 | 2023 |
InstaHide's Sample Complexity When Mixing Two Private Images B Huang, Z Song, R Tao, J Yin, R Zhang, D Zhuo arXiv preprint arXiv:2011.11877, 2020 | 5 | 2020 |
Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters Z Song, J Yin, R Zhang The 28th Annual Quantum Information Processing Conference (QIP 2025), 2025 | 4 | 2025 |
Faster robust tensor power method for arbitrary order Y Deng, Z Song, J Yin arXiv preprint arXiv:2306.00406, 2023 | 4 | 2023 |
A Faster k-means++ Algorithm J Liang, S Sarkhel, Z Song, C Yin, J Yin, D Zhuo arXiv preprint arXiv:2211.15118, 2022 | 4 | 2022 |
A Dynamic Low-Rank Fast Gaussian Transform B Huang, Z Song, O Weinstein, J Yin, H Zhang, R Zhang arXiv preprint arXiv:2202.12329, 2022 | 4 | 2022 |
Fast and efficient matching algorithm with deadline instances Z Song, W Wang, C Yin, J Yin Conference on Parsimony and Learning (CPAL 2025), 2025 | 3 | 2025 |
Sublinear Time Algorithm for Online Weighted Bipartite Matching H Hu, Z Song, R Tao, Z Xu, J Yin, D Zhuo arXiv preprint arXiv:2208.03367, 2022 | 3 | 2022 |