Judging llm-as-a-judge with mt-bench and chatbot arena L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ... Advances in Neural Information Processing Systems 36, 46595-46623, 2023 | 3049* | 2023 |
Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ... See https://vicuna.lmsys.org (accessed 14 April 2023) 2 (3), 6, 2023 | 2606* | 2023 |
Efficient memory management for large language model serving with pagedattention W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ... Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023 | 1271 | 2023 |
Lmsys-chat-1m: A large-scale real-world llm conversation dataset L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ... arXiv preprint arXiv:2309.11998, 2023 | 118 | 2023 |
Terapipe: Token-level pipeline parallelism for training large-scale language models Z Li, S Zhuang, S Guo, D Zhuo, H Zhang, D Song, I Stoica International Conference on Machine Learning, 6543-6552, 2021 | 107 | 2021 |
SkyPilot: An intercloud broker for sky computing Z Yang, Z Wu, M Luo, WL Chiang, R Bhardwaj, W Kwon, S Zhuang, ... 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023 | 79 | 2023 |
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. arXiv 2023 L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ... arXiv preprint arXiv:2306.05685, 0 | 51 | |
Hoplite: efficient and fault-tolerant collective communication for task-based distributed systems S Zhuang, Z Li, D Zhuo, S Wang, E Liang, R Nishihara, P Moritz, I Stoica Proceedings of the 2021 ACM SIGCOMM 2021 Conference, 641-656, 2021 | 28 | 2021 |
sensai: Convnets decomposition via class parallelism for fast inference on live data G Wang, Z Liu, B Hsieh, S Zhuang, J Gonzalez, T Darrell, I Stoica Proceedings of Machine Learning and Systems 3, 664-679, 2021 | 18 | 2021 |
Composing MPC with LQR and Neural Network for Amortized Efficiency and Stable Control F Wu, G Wang, S Zhuang, K Wang, A Keimer, I Stoica, A Bayen arXiv preprint arXiv:2112.07238, 2021 | 9 | 2021 |
ExoFlow: A universal workflow system for Exactly-Once DAGs S Zhuang, S Wang, E Liang, Y Cheng, I Stoica 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 7 | 2023 |
Sensai: Fast convnets serving on live data via class parallelism G Wang, Z Liu, S Zhuang, B Hsieh, J Gonzalez, I Stoica MLOps Systems workshop in MLSys, 2020 | 7 | 2020 |
Judgebench: A benchmark for evaluating llm-based judges S Tan, S Zhuang, K Montgomery, WY Tang, A Cuadron, C Wang, ... arXiv preprint arXiv:2410.12784, 2024 | 5 | 2024 |
Rearchitecting in-memory object stores for low latency D Zhuo, K Zhang, Z Li, S Zhuang, S Wang, A Chen, I Stoica Proceedings of the VLDB Endowment, 555-568, 2021 | 3 | 2021 |
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks A Cuadron, D Li, W Ma, X Wang, Y Wang, S Zhuang, S Liu, LG Schroeder, ... arXiv preprint arXiv:2502.08235, 2025 | | 2025 |
A Statistical Framework for Ranking LLM-Based Chatbots S Ameli, S Zhuang, I Stoica, MW Mahoney arXiv preprint arXiv:2412.18407, 2024 | | 2024 |
Starburst: a cost-aware scheduler for hybrid cloud M Luo, S Zhuang, S Vengadesan, R Bhardwaj, J Chang, E Friedman, ... Proceedings of the 2024 USENIX Conference on Usenix Annual Technical …, 2024 | | 2024 |
Providing Efficient Fault Tolerance in Distributed Systems S Zhuang University of California, Berkeley, 2024 | | 2024 |
Starburst: A Cost-aware Scheduler for Hybrid Cloud M Luo, S Zhuang, S Vengadesan, R Bhardwaj, J Chang, E Friedman, ... 2024 USENIX Annual Technical Conference (USENIX ATC 24), 37-57, 2024 | | 2024 |
AVOIDING GPU OOM FOR DYNAMIC COMPUTATIONAL GRAPHS TRAINING S Zhuang | | |