SecondNet: a data center network virtualization architecture with bandwidth guarantees C Guo, G Lu, HJ Wang, S Yang, C Kong, P Sun, W Wu, Y Zhang Proceedings of the 6th International Conference, 1-12, 2010 | 842 | 2010 |
InternLM2 technical report Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... arXiv preprint arXiv:2403.17297, 2024 | 210 | 2024 |
Characterization and prediction of deep learning workloads in large-scale GPU datacenters Q Hu, P Sun, S Yan, Y Wen, T Zhang Proceedings of the International Conference for High Performance Computing …, 2021 | 137 | 2021 |
A Network-state Management Service P Sun, R Mahajan, J Rexford, L Yuan, M Zhang, A Arefin Proceedings of the 2014 ACM conference on SIGCOMM, 563-574, 2014 | 119 | 2014 |
Identifying performance bottlenecks in CDNs through TCP-level monitoring P Sun, M Yu, MJ Freedman, J Rexford Proceedings of the first ACM SIGCOMM workshop on Measurements up the stack …, 2011 | 106 | 2011 |
HONE: Joint host-network traffic management in software-defined networks P Sun, M Yu, MJ Freedman, J Rexford, D Walker Journal of Network and Systems Management 23, 374-399, 2015 | 84 | 2015 |
Optimizing network performance for distributed DNN training on GPU clusters: ImageNet/AlexNet training in 1.5 minutes P Sun, W Feng, R Han, S Yan, Y Wen arXiv preprint arXiv:1902.06855, 2019 | 83 | 2019 |
Towards distributed machine learning in shared clusters: A dynamically-partitioned approach P Sun, Y Wen, NBD Ta, S Yan 2017 IEEE International Conference on Smart Computing (SMARTCOMP), 1-6, 2017 | 50 | 2017 |
Unbiased sampling in directed social graph T Wang, Y Chen, Z Zhang, P Sun, B Deng, X Li ACM SIGCOMM Computer Communication Review 40 (4), 401-402, 2010 | 49 | 2010 |
Chronus: A novel deadline-aware scheduler for deep learning training jobs W Gao, Z Ye, P Sun, Y Wen, T Zhang Proceedings of the ACM Symposium on Cloud Computing, 609-623, 2021 | 42 | 2021 |
GradientFlow: Optimizing network performance for large-scale distributed DNN training P Sun, Y Wen, R Han, W Feng, S Yan IEEE Transactions on Big Data 8 (2), 495-507, 2019 | 36 | 2019 |
Deep learning workload scheduling in GPU datacenters: Taxonomy, challenges and vision W Gao, Q Hu, Z Ye, P Sun, X Wang, Y Luo, T Zhang, Y Wen arXiv preprint arXiv:2205.11913, 2022 | 35 | 2022 |
Characterization of large language model development in the datacenter Q Hu, Z Ye, Z Wang, G Wang, M Zhang, Q Chen, P Sun, D Lin, X Wang, ... 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024 | 32 | 2024 |
Scalable programmable inbound traffic engineering P Sun, L Vanbever, J Rexford Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking …, 2015 | 29 | 2015 |
Broadband achromatic polarization insensitive metalens over 950 nm bandwidth in the visible and near-infrared P Sun, M Zhang, F Dong, L Feng, W Chu Chinese Optics Letters 20 (1), 013601, 2022 | 26 | 2022 |
Lucid: A non-intrusive, scalable and interpretable scheduler for deep learning training jobs Q Hu, M Zhang, P Sun, Y Wen, T Zhang Proceedings of the 28th ACM International Conference on Architectural …, 2023 | 25 | 2023 |
Deep learning workload scheduling in GPU datacenters: A survey Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo, T Zhang, Y Wen ACM Computing Surveys 56 (6), 1-38, 2024 | 21 | 2024 |
Elan: Towards generic and efficient elastic training for deep learning L Xie, J Zhai, B Wu, Y Wang, X Zhang, P Sun, S Yan 2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020 | 20 | 2020 |
Timed dataflow: Reducing communication overhead for distributed machine learning systems P Sun, Y Wen, TNB Duong, S Yan 2016 IEEE 22nd International Conference on Parallel and Distributed Systems …, 2016 | 20 | 2016 |
Cloud3DView: An interactive tool for cloud data center operations J Yin, P Sun, Y Wen, H Gong, M Liu, X Li, H You, J Gao, C Lin Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM, 499-500, 2013 | 19 | 2013 |