A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters. Y Jiang, Y Zhu, C Lan, B Yi, Y Cui, C Guo. 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), 2020. Cited by 346.
Accelerating Distributed MoE Training and Inference with Lina. J Li, Y Jiang, Y Zhu, C Wang, H Xu. 2023 USENIX Annual Technical Conference (USENIX ATC 23), 945-959, 2023. Cited by 43.
Janus: A Unified Distributed Training Framework for Sparse Mixture-of-Experts Models. J Liu, JH Wang, Y Jiang. Proceedings of the ACM SIGCOMM 2023 Conference, 486-498, 2023. Cited by 25.
Adaptive Gating in Mixture-of-Experts based Language Models. J Li, Q Su, Y Yang, Y Jiang, C Wang, H Xu. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023. Cited by 13.
Teola: Towards End-to-End Optimization of LLM-based Applications. X Tan, Y Jiang, Y Yang, H Xu. arXiv preprint arXiv:2407.00326, 2024. Cited by 4.
Lita: Accelerating Distributed Training of Sparsely Activated Models. J Li, Y Jiang, Y Zhu, C Wang, H Xu. arXiv preprint arXiv:2210.17223, 2022. Cited by 2.
DSV: Exploiting Dynamic Sparsity to Accelerate Large-Scale Video DiT Training. X Tan, Y Chen, Y Jiang, X Chen, K Yan, N Duan, Y Zhu, D Jiang, H Xu. arXiv preprint arXiv:2502.07590, 2025.
InfinitePOD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers. C Shou, G Liu, H Nie, H Meng, Y Zhou, Y Jiang, W Lv, Y Xu, Y Lu, Z Chen, ... arXiv preprint arXiv:2502.03885, 2025.
Communication Acceleration for Distributed Deep Learning Training (分布式深度学习训练的通信加速). Y Jiang, Y Peng, Y Zhu, C Guo. Communications of the CCF 17 (9), 50-57, 2021.