Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic. J Gu, C Li, Y Liang, Z Shi, Z Song, T Zhou. arXiv preprint arXiv:2402.09469, 2024 | 18 | 2024 |
Exploring the frontiers of softmax: Provable optimization, applications in diffusion model, and beyond. J Gu, C Li, Y Liang, Z Shi, Z Song. arXiv preprint arXiv:2405.03251, 2024 | 14 | 2024 |
A theoretical insight into attack and defense of gradient leakage in transformer. C Li, Z Song, W Wang, C Yang. arXiv preprint arXiv:2311.13624, 2023 | 7 | 2023 |
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space. R Addanki, C Li, Z Song, C Yang. arXiv preprint arXiv:2311.14652, 2023 | 2 | 2023 |
Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method. C Li, Z Song, Z Xu, J Yin. arXiv preprint arXiv:2408.11267, 2024 | | 2024 |
Research on field optimization model of heliostat based on ray tracing. J Guo, C Li, H Lu. Highlights in Science, Engineering and Technology 82, 390-399, 2024 | | 2024 |