Yutao Sun
Verified email at mails.tsinghua.edu.cn - Homepage
Title | Cited by | Year
Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers
D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui, F Wei
arXiv preprint arXiv:2212.10559, 2022
Cited by: 393, Year: 2022
Retentive network: A successor to transformer for large language models
Y Sun, L Dong, S Huang, S Ma, Y Xia, J Xue, J Wang, F Wei
arXiv preprint arXiv:2307.08621, 2023
Cited by: 326, Year: 2023
A length-extrapolatable transformer
Y Sun, L Dong, B Patra, S Ma, S Huang, A Benhaim, V Chaudhary, ...
arXiv preprint arXiv:2212.10554, 2022
Cited by: 151, Year: 2022
Structured prompting: Scaling in-context learning to 1,000 examples
Y Hao, Y Sun, L Dong, Z Han, Y Gu, F Wei
arXiv preprint arXiv:2212.06713, 2022
Cited by: 50, Year: 2022
Prototypical calibration for few-shot learning of language models
Z Han, Y Hao, L Dong, Y Sun, F Wei
arXiv preprint arXiv:2205.10183, 2022
Cited by: 41, Year: 2022
You only cache once: Decoder-decoder architectures for language models
Y Sun, L Dong, Y Zhu, S Huang, W Wang, S Ma, Q Zhang, J Wang, F Wei
Advances in Neural Information Processing Systems 37, 7339-7361, 2025
Cited by: 33, Year: 2025
Differential transformer
T Ye, L Dong, Y Xia, Y Sun, Y Zhu, G Huang, F Wei
arXiv preprint arXiv:2410.05258, 2024
Cited by: 19, Year: 2024
FocusLLM: Scaling LLM's Context by Parallel Decoding
Z Li, Y Zhang, T Pan, Y Sun, Z Duan, J Fang, R Han, Z Wang, J Wang
arXiv preprint arXiv:2408.11745, 2024
Cited by: 3, Year: 2024
Multimodal Latent Language Modeling with Next-Token Diffusion
Y Sun, H Bao, W Wang, Z Peng, L Dong, S Huang, J Wang, F Wei
arXiv preprint arXiv:2412.08635, 2024
Cited by: 1, Year: 2024