Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent
X Lian, C Zhang, H Zhang, CJ Hsieh, W Zhang, J Liu
Advances in Neural Information Processing Systems 30, 2017
Cited by 1425

Asynchronous decentralized parallel stochastic gradient descent
X Lian, W Zhang, C Zhang, J Liu
International Conference on Machine Learning, 3043-3052, 2018
Cited by 601

Asynchronous parallel stochastic gradient for nonconvex optimization
X Lian, Y Huang, Y Li, J Liu
Advances in Neural Information Processing Systems, 2737-2745, 2015
Cited by 583

D²: Decentralized Training over Decentralized Data
H Tang, X Lian, M Yan, C Zhang, J Liu
International Conference on Machine Learning, 4848-4856, 2018
Cited by 425

Staleness-aware Async-SGD for Distributed Deep Learning
W Zhang, S Gupta, X Lian, J Liu
International Joint Conference on Artificial Intelligence, 2016
Cited by 341

DoubleSqueeze: Parallel stochastic gradient descent with double-pass error-compensated compression
H Tang, C Yu, X Lian, T Zhang, J Liu
International Conference on Machine Learning, 6155-6165, 2019
Cited by 282

DouZero: Mastering DouDizhu with self-play deep reinforcement learning
D Zha, J Xie, W Ma, S Zhang, X Lian, X Hu, J Liu
International Conference on Machine Learning, 12333-12344, 2021
Cited by 150

A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order
X Lian, H Zhang, CJ Hsieh, Y Huang, J Liu
Advances in Neural Information Processing Systems, 2016
Cited by 132

Finite-sum Composition Optimization via Variance Reduced Gradient Descent
X Lian, M Wang, J Liu
Artificial Intelligence and Statistics, 2017
Cited by 98

1-bit Adam: Communication efficient large-scale training with Adam's convergence speed
H Tang, S Gan, AA Awan, S Rajbhandari, C Li, X Lian, J Liu, C Zhang, ...
International Conference on Machine Learning, 10118-10129, 2021
Cited by 95

Asynchronous Parallel Greedy Coordinate Descent
Y You*, X Lian* (equal contribution), J Liu, HF Yu, I Dhillon, J Demmel, ...
Advances in Neural Information Processing Systems, 2016
Cited by 53

Revisit batch normalization: New understanding and refinement via composition optimization
X Lian, J Liu
The 22nd International Conference on Artificial Intelligence and Statistics, 2019
Cited by 52

DeepSqueeze: Decentralization meets error-compensated compression
H Tang, X Lian, S Qiu, L Yuan, C Zhang, T Zhang, J Liu
arXiv preprint arXiv:1907.07346, 2019
Cited by 46

Efficient smooth non-convex stochastic compositional optimization via stochastic recursive gradient descent
W Hu, CJ Li, X Lian, J Liu, H Yuan
Advances in Neural Information Processing Systems 32, 2019
Cited by 36

Stochastic recursive momentum for policy gradient methods
H Yuan, X Lian, J Liu, Y Zhou
arXiv preprint arXiv:2003.04302, 2020
Cited by 35

Bagua: Scaling up distributed learning with system relaxations
S Gan, X Lian, R Wang, J Chang, C Liu, H Shi, S Zhang, X Li, T Sun, ...
arXiv preprint arXiv:2107.01499, 2021
Cited by 32

Persia: An open, hybrid system scaling deep learning-based recommenders up to 100 trillion parameters
X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ...
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022
Cited by 29

Persia: A hybrid system scaling deep learning-based recommenders up to 100 trillion parameters
X Lian, B Yuan, X Zhu, Y Wang, Y He, H Wu, L Sun, H Lyu, C Liu, X Dong, ...
arXiv preprint arXiv:2111.05897, 2021
Cited by 14

NMR evidence for field-induced ferromagnetism in (Li0.8Fe0.2)OHFeSe superconductor
YP Wu, D Zhao, XR Lian, XF Lu, NZ Wang, XG Luo, XH Chen, T Wu
Physical Review B 91 (12), 125107, 2015
Cited by 12

Stochastic recursive variance reduction for efficient smooth non-convex compositional optimization
H Yuan, X Lian, J Liu
arXiv preprint arXiv:1912.13515, 2019
Cited by 11