How to Protect Copyright Data in Optimization of Large Language Models? | T Chu, Z Song, C Yang | Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 17871 …, 2024 | Cited by: 42 | Year: 2024
Towards Infinite-Long Prefix in Transformer | Y Liang, Z Shi, Z Song, C Yang | arXiv preprint arXiv:2406.14036, 2024 | Cited by: 17* | Year: 2024
Fine-tune language models to approximate unbiased in-context learning | T Chu, Z Song, C Yang | arXiv preprint arXiv:2310.03331, 2023 | Cited by: 16 | Year: 2023
Unmasking transformers: A theoretical approach to data recovery via attention weights | Y Deng, Z Song, S Xie, C Yang | arXiv preprint arXiv:2310.12462, 2023 | Cited by: 10 | Year: 2023
An automatic learning rate schedule algorithm for achieving faster convergence and steeper descent | Z Song, C Yang | arXiv preprint arXiv:2310.11291, 2023 | Cited by: 8 | Year: 2023
Attention is Naturally Sparse with Gaussian Distributed Input | Y Deng, Z Song, C Yang | arXiv preprint arXiv:2404.02690, 2024 | Cited by: 7 | Year: 2024
A theoretical insight into attack and defense of gradient leakage in transformer | C Li, Z Song, W Wang, C Yang | arXiv preprint arXiv:2311.13624, 2023 | Cited by: 6 | Year: 2023
Curse of attention: A kernel-based perspective for why transformers fail to generalize on time series forecasting and beyond | Y Ke, Y Liang, Z Shi, Z Song, C Yang | arXiv preprint arXiv:2412.06061, 2024 | Cited by: 2 | Year: 2024
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space | R Addanki, C Li, Z Song, C Yang | arXiv preprint arXiv:2311.14652, 2023 | Cited by: 2 | Year: 2023
Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence | Y Deng, Z Song, C Yang | arXiv preprint arXiv:2402.01515, 2024 | Cited by: 1 | Year: 2024
Unlocking the Theory Behind Scaling 1-Bit Neural Networks | M Daliri, Z Song, C Yang | arXiv preprint arXiv:2411.01663, 2024 | Cited by: | Year: 2024