Jiangfei Duan
Verified email at ie.cuhk.edu.hk
Title
Cited by
Year
Spotserve: Serving generative large language models on preemptible instances
X Miao, C Shi, J Duan, X Xi, D Lin, B Cui, Z Jia
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 54, Year: 2024
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
H Duanmu, Z Yuan, X Li, J Duan, X Zhang, D Lin
First Conference on Language Modeling (COLM 24), 2024
Cited by: 18, Year: 2024
Centauri: Enabling efficient scheduling for communication-computation overlap in large model training via communication partitioning
C Chen, X Li, Q Zhu, J Duan, P Sun, X Zhang, C Yang
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 15, Year: 2024
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Q Zhu, J Duan, C Chen, S Liu, X Li, G Feng, X Lv, H Cao, X Chuanfu, ...
arXiv preprint arXiv:2406.15486, 2024
Cited by: 12*, Year: 2024
Efficient training of large language models on distributed infrastructures: a survey
J Duan, S Zhang, Z Wang, L Jiang, W Qu, Q Hu, G Wang, Q Weng, H Yan, ...
arXiv preprint arXiv:2407.20018, 2024
Cited by: 9, Year: 2024
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
J Duan, Z Song, X Miao, X Xi, D Lin, H Xu, M Zhang, Z Jia
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024
Cited by: 7, Year: 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
J Duan, R Lu, H Duanmu, X Li, X Zhang, D Lin, I Stoica, H Zhang
Forty-first International Conference on Machine Learning, 2024
Cited by: 6*, Year: 2024
Proteus: Simulating the performance of distributed DNN training
J Duan, X Li, P Xu, X Zhang, S Yan, Y Liang, D Lin
IEEE Transactions on Parallel and Distributed Systems, 2024
Cited by: 4, Year: 2024