Ke-shi Ge
School of Computer Science, National University of Defense Technology
Verified email at nudt.edu.cn
Title
Cited by
Year
Merak: An efficient distributed DNN training framework with automated 3D parallelism for giant foundation models
Z Lai, S Li, X Tang, K Ge, W Liu, Y Duan, L Qiao, D Li
IEEE Transactions on Parallel and Distributed Systems 34 (5), 1466-1478, 2023
42 · 2023
AutoPipe: A fast pipeline parallelism approach with balanced partitioning and micro-batch slicing
W Liu, Z Lai, S Li, Y Duan, K Ge, D Li
2022 IEEE International Conference on Cluster Computing (CLUSTER), 301-312, 2022
21 · 2022
HPDL: towards a general framework for high-performance distributed deep learning
D Li, Z Lai, K Ge, Y Zhang, Z Zhang, Q Wang, H Wang
2019 IEEE 39th International Conference on Distributed Computing Systems …, 2019
21 · 2019
An efficient ADMM-based algorithm to nonconvex penalized support vector machines
L Guan, L Qiao, D Li, T Sun, K Ge, X Lu
2018 IEEE International Conference on Data Mining Workshops (ICDMW), 1209-1216, 2018
21 · 2018
An efficient parallel and distributed solution to nonconvex penalized linear SVMs
L Guan, T Sun, L Qiao, Z Yang, D Li, K Ge, X Lu
Frontiers of Information Technology & Electronic Engineering 21, 587-603, 2020
17 · 2020
HPH: Hybrid parallelism on heterogeneous clusters for accelerating large-scale DNNs training
Y Duan, Z Lai, S Li, W Liu, K Ge, P Liang, D Li
2022 IEEE International Conference on Cluster Computing (CLUSTER), 313-323, 2022
13 · 2022
Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit
K Ge, H Su, D Li, X Lu
Frontiers of Information Technology & Electronic Engineering 18 (7), 915-927, 2017
9 · 2017
Deep discriminative clustering network
X Shao, K Ge, H Su, L Luo, B Peng, D Li
2018 International Joint Conference on Neural Networks (IJCNN), 1-7, 2018
8 · 2018
Prophet: Fine-grained load balancing for parallel training of large-scale MoE models
W Wang, Z Lai, S Li, W Liu, K Ge, Y Liu, A Shen, D Li
2023 IEEE International Conference on Cluster Computing (CLUSTER), 82-94, 2023
7 · 2023
Accelerate distributed deep learning with cluster-aware sketch quantization
K Ge, Y Zhang, Y Fu, Z Lai, X Deng, D Li
Science China Information Sciences 66 (6), 162102, 2023
5 · 2023
Automated tensor model parallelism with overlapped communication for efficient foundation model training
S Li, Z Lai, Y Hao, W Liu, K Ge, X Deng, D Li, K Lu
arXiv preprint arXiv:2305.16121, 2023
5 · 2023
Advances of pipeline model parallelism for deep learning training: an overview
L Guan, DS Li, JY Liang, WJ Wang, KS Ge, XC Lu
Journal of Computer Science and Technology 39 (3), 567-584, 2024
4 · 2024
S2 reducer: High-performance sparse communication to accelerate distributed deep learning
K Ge, Y Fu, Y Zhang, Z Lai, X Deng, D Li
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
4 · 2022
A multidimensional communication scheduling method for hybrid parallel dnn training
S Li, K Lu, Z Lai, W Liu, K Ge, D Li
IEEE Transactions on Parallel and Distributed Systems, 2024
3 · 2024
Compressed collective sparse-sketch for distributed data-parallel training of deep learning models
K Ge, K Lu, Y Fu, X Deng, Z Lai, D Li
IEEE Journal on Selected Areas in Communications 41 (4), 941-963, 2023
3 · 2023
Auto-divide GNN: Accelerating GNN training with subgraph division
H Chen, Z Ran, K Ge, Z Lai, J Jiang, D Li
European Conference on Parallel Processing, 367-382, 2023
1 · 2023
BRGraph: An efficient graph neural network training system by reusing batch data on GPU
K Ge, Z Ran, Z Lai, L Zhang, D Li
Concurrency and Computation: Practice and Experience 34 (15), e6961, 2022
1 · 2022
CASQ: Accelerate distributed deep learning with sketch-based gradient quantization
K Ge, Y Zhang, Y Fu, Z Lai, X Deng, D Li
2021 IEEE International Conference on Cluster Computing (CLUSTER), 825-826, 2021
1 · 2021
Efficient deep neural network training via decreasing precision with layer capacity
A Shen, Z Lai, T Sun, S Li, K Ge, W Liu, D Li
Frontiers of Computer Science 19 (10), 1910355, 2025
2025
AutoPipe-H: A Heterogeneity-Aware Data-Paralleled Pipeline Approach on Commodity GPU Servers
W Liu, K Lu, Z Lai, S Li, K Ge, D Li, X Lu
IEEE Transactions on Computers, 2024
2024