Chiwun Yang
Other names: 杨 智桓
Student of Artificial Intelligence, Sun Yat-sen University
Verified email at mail2.sysu.edu.cn
Title
Cited by
Year
How to Protect Copyright Data in Optimization of Large Language Models?
T Chu, Z Song, C Yang
Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 17871 …, 2024
Cited by: 42, Year: 2024
Towards Infinite-Long Prefix in Transformer
Y Liang, Z Shi, Z Song, C Yang
arXiv preprint arXiv:2406.14036, 2024
Cited by: 17*, Year: 2024
Fine-tune language models to approximate unbiased in-context learning
T Chu, Z Song, C Yang
arXiv preprint arXiv:2310.03331, 2023
Cited by: 16, Year: 2023
Unmasking transformers: A theoretical approach to data recovery via attention weights
Y Deng, Z Song, S Xie, C Yang
arXiv preprint arXiv:2310.12462, 2023
Cited by: 10, Year: 2023
An automatic learning rate schedule algorithm for achieving faster convergence and steeper descent
Z Song, C Yang
arXiv preprint arXiv:2310.11291, 2023
Cited by: 8, Year: 2023
Attention is Naturally Sparse with Gaussian Distributed Input
Y Deng, Z Song, C Yang
arXiv preprint arXiv:2404.02690, 2024
Cited by: 7, Year: 2024
A theoretical insight into attack and defense of gradient leakage in transformer
C Li, Z Song, W Wang, C Yang
arXiv preprint arXiv:2311.13624, 2023
Cited by: 6, Year: 2023
Curse of attention: A kernel-based perspective for why transformers fail to generalize on time series forecasting and beyond
Y Ke, Y Liang, Z Shi, Z Song, C Yang
arXiv preprint arXiv:2412.06061, 2024
Cited by: 2, Year: 2024
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
R Addanki, C Li, Z Song, C Yang
arXiv preprint arXiv:2311.14652, 2023
Cited by: 2, Year: 2023
Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence
Y Deng, Z Song, C Yang
arXiv preprint arXiv:2402.01515, 2024
Cited by: 1, Year: 2024
Unlocking the Theory Behind Scaling 1-Bit Neural Networks
M Daliri, Z Song, C Yang
arXiv preprint arXiv:2411.01663, 2024
Year: 2024
Articles 1–11