SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models. G Xiao, J Lin, M Seznec, H Wu, J Demouth, S Han. International Conference on Machine Learning (ICML), 38087-38099, 2023. Cited by 812.
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration. J Lin, J Tang, H Tang, S Yang, WM Chen, WC Wang, G Xiao, X Dang, ... Proceedings of Machine Learning and Systems 6, 87-100, 2024. Cited by 629.
Efficient Streaming Language Models with Attention Sinks. G Xiao, Y Tian, B Chen, S Han, M Lewis. International Conference on Learning Representations (ICLR), 2024. Cited by 409.
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention. G Xiao, T Yin, WT Freeman, F Durand, S Han. International Journal of Computer Vision, 1-20, 2024. Cited by 171.
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks. Z Zhang, G Xiao, Y Li, T Lv, F Qi, Z Liu, Y Wang, X Jiang, M Sun. Machine Intelligence Research 20 (2), 180-193, 2023. Cited by 93.
Offsite-Tuning: Transfer Learning without Full Model. G Xiao, J Lin, S Han. arXiv preprint arXiv:2302.04870, 2023. Cited by 66.
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving. Y Lin, H Tang, S Yang, Z Zhang, G Xiao, C Gan, S Han. arXiv preprint arXiv:2405.04532, 2024. Cited by 43.
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference. J Tang, Y Zhao, K Zhu, G Xiao, B Kasikci, S Han. International Conference on Machine Learning (ICML), 2024. Cited by 31.
Retrieval Head Mechanistically Explains Long-Context Factuality. W Wu, Y Wang, G Xiao, H Peng, Y Fu. International Conference on Learning Representations (ICLR), 2025. Cited by 28.
InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory. C Xiao, P Zhang, X Han, G Xiao, Y Lin, Z Zhang, Z Liu, S Han, M Sun. Advances in Neural Information Processing Systems (NeurIPS), 2024. Cited by 25.
BitDelta: Your Fine-Tune May Only Be Worth One Bit. J Liu, G Xiao, K Li, JD Lee, S Han, T Dao, T Cai. Advances in Neural Information Processing Systems (NeurIPS), 2024. Cited by 16.
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads. G Xiao, J Tang, J Zuo, J Guo, S Yang, H Tang, Y Fu, S Han. International Conference on Learning Representations (ICLR), 2025. Cited by 8.
ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training. K Huang, H Jiang, M Wang, G Xiao, D Wipf, X Song, Q Gan, Z Huang, ... arXiv preprint arXiv:2301.07482, 2023. Cited by 7.
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training. K Huang, H Jiang, M Wang, G Xiao, D Wipf, X Song, Q Gan, Z Huang, ... Proceedings of the VLDB Endowment 17 (6), 1473-1486, 2024. Cited by 4.
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration. J Lin, J Tang, H Tang, S Yang, G Xiao, S Han. GetMobile: Mobile Computing and Communications 28 (4), 12-17, 2025.
Efficient Deployment Algorithms for Large Language Models. G Xiao. Massachusetts Institute of Technology, 2024.
Sparse and Local Networks for Hypergraph Reasoning. G Xiao, LP Kaelbling, J Wu, J Mao. Learning on Graphs Conference, 34:1-34:16, 2022.