Shuai Zheng

Cited by

	All	Since 2020
Citations	1016	918
h-index	14	13
i10-index	14	13

Citations per year
Year	Citations
2016	9
2017	18
2018	24
2019	42
2020	76
2021	124
2022	145
2023	260
2024	265
2025	42

Public access

9 articles available
0 articles not available

Based on funding mandates


Shuai Zheng

Amazon Web Services

Verified email at connect.ust.hk

Machine Learning, Large Language Model, Distributed Optimization, Distributed System


Title	Authors	Venue	Cited by	Year
Gluoncv and gluonnlp: Deep learning in computer vision and natural language processing	J Guo, H He, T He, L Lausen, M Li, H Lin, X Shi, C Wang, J Xie, S Zha, ...	Journal of Machine Learning Research 21 (23), 1-7, 2020	250	2020
Communication-efficient distributed blockwise momentum SGD with error-feedback	S Zheng, Z Huang, J Kwok	Advances in Neural Information Processing Systems 32, 2019	150	2019
Alexa teacher model: Pretraining and distilling multi-billion-parameter encoders for natural language understanding systems	J FitzGerald, S Ananthakrishnan, K Arkoudas, D Bernardi, A Bhagia, ...	Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022	76	2022
Fast-and-Light Stochastic ADMM	S Zheng, JT Kwok	IJCAI, 2407-2613, 2016	71	2016
Gemini: Fast failure recovery in distributed training with in-memory checkpoints	Z Wang, Z Jia, S Zheng, Z Zhang, X Fu, TSE Ng, Y Wang	Proceedings of the 29th Symposium on Operating Systems Principles, 364-381, 2023	56	2023
Partial and asymmetric contrastive learning for out-of-distribution detection in long-tailed recognition	H Wang, A Zhang, Y Zhu, S Zheng, M Li, AJ Smola, Z Wang	International Conference on Machine Learning, 23446-23458, 2022	54	2022
Removing batch normalization boosts adversarial training	H Wang, A Zhang, S Zheng, X Shi, M Li, Z Wang	International Conference on Machine Learning, 23433-23445, 2022	52	2022
Cser: Communication-efficient sgd with error reset	C Xie, S Zheng, S Koyejo, I Gupta, M Li, H Lin	Advances in Neural Information Processing Systems 33, 12593-12603, 2020	47	2020
Asynchronous Distributed Semi-Stochastic Gradient Optimization	R Zhang, S Zheng, JT Kwok	AAAI, 2323-2329, 2016	46*	2016
MiCS: Near-linear scaling for training gigantic model on public cloud	Z Zhang, S Zheng, Y Wang, J Chiu, G Karypis, T Chilimbi, M Li, X Jin	arXiv preprint arXiv:2205.00119, 2022	42	2022
Follow the moving leader in deep learning	S Zheng, JT Kwok	International Conference on Machine Learning, 4110-4119, 2017	31	2017
Prompt pre-training with twenty-thousand classes for open-vocabulary visual recognition	S Ren, A Zhang, Y Zhu, S Zhang, S Zheng, M Li, AJ Smola, X Sun	Advances in Neural Information Processing Systems 36, 12569-12588, 2023	29	2023
Accelerated large batch optimization of bert pretraining in 54 minutes	S Zheng, H Lin, S Zha, M Li	arXiv preprint arXiv:2006.13484, 2020	22	2020
Stochastic variance-reduced admm	S Zheng, JT Kwok	arXiv preprint arXiv:1604.07070, 2016	15	2016
VCC: scaling transformers to 128k tokens or more by prioritizing important tokens	Z Zeng, C Hawkins, M Hong, A Zhang, N Pappas, V Singh, S Zheng	Advances in Neural Information Processing Systems 36, 20260-20286, 2023	9	2023
Compressed communication for distributed training: Adaptive methods and system	Y Zhong, C Xie, S Zheng, H Lin	arXiv preprint arXiv:2105.07829, 2021	9	2021
Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data	S Zheng, JT Kwok	International Conference on Machine Learning, 5932-5940, 2018	8	2018
Lancet: Accelerating mixture-of-experts training via whole graph computation-communication overlapping	C Jiang, Y Tian, Z Jia, S Zheng, C Wu, Y Wang	Proceedings of Machine Learning and Systems 6, 74-86, 2024	7	2024
DISTMM: Accelerating Distributed Multimodal Model Training	J Huang, Z Zhang, S Zheng, F Qin, Y Wang	21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024	7	2024
Blockwise adaptivity: Faster training and better generalization in deep learning	S Zheng, JT Kwok	arXiv preprint arXiv:1905.09899, 2019	7	2019


Articles 1–20
