Shengyu Liu
Verified email at stu.pku.edu.cn
Title
Cited by
Year
DistServe: Disaggregating prefill and decoding for goodput-optimized large language model serving
Y Zhong, S Liu, J Chen, J Hu, Y Zhu, X Liu, X Jin, H Zhang
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by 123; year 2024
Fast distributed inference serving for large language models
B Wu, Y Zhong, Z Zhang, S Liu, F Liu, Y Sun, G Huang, X Liu, X Jin
arXiv preprint arXiv:2305.05920, 2023
Cited by 88; year 2023
LoongServe: Efficiently serving long-context large language models with elastic sequence parallelism
B Wu, S Liu, Y Zhong, P Sun, X Liu, X Jin
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles …, 2024
Cited by 26; year 2024
RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion
Y Zhong, Z Zhang, B Wu, S Liu, Y Chen, C Wan, H Hu, L Xia, R Ming, ...
arXiv preprint arXiv:2409.13221, 2024
Cited by 4; year 2024
SwiftLLM: A tiny yet powerful LLM inference system tailored for researching purpose
S Liu
https://github.com/interestingLSY/swiftLLM, 2024
2024
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
Y Zhong, S Liu, J Chen, J Hu, Y Zhu, X Liu, X Jin, H Zhang
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
2024