Tighter Theory for Local SGD on Identical and Heterogeneous Data A Khaled, K Mishchenko, P Richtárik AISTATS 2020 (arXiv:1909.04746), 2020 | 507 | 2020 |
Better theory for SGD in the nonconvex world A Khaled, P Richtárik TMLR, 2023 | 228 | 2020 |
First analysis of local GD on heterogeneous data A Khaled, K Mishchenko, P Richtárik arXiv preprint arXiv:1909.04715, 2019 | 191 | 2019 |
Random Reshuffling: Simple Analysis with Vast Improvements K Mishchenko, A Khaled, P Richtárik NeurIPS 2020 (arXiv:2006.05988), 2020 | 166 | 2020 |
Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization A Khaled, O Sebbouh, N Loizou, RM Gower, P Richtárik JOTA, 2023 | 53 | 2020 |
Proximal and federated random reshuffling K Mishchenko, A Khaled, P Richtárik International Conference on Machine Learning, 15718-15749, 2022 | 48 | 2022 |
Better Communication Complexity for Local SGD A Khaled, K Mishchenko, P Richtárik arXiv preprint arXiv:1909.04746v1, 2019 | 33 | 2019 |
The road less scheduled A Defazio, X Yang, A Khaled, K Mishchenko, H Mehta, A Cutkosky Advances in Neural Information Processing Systems 37, 9974-10007, 2024 | 30 | 2024 |
Gradient descent with compressed iterates A Khaled, P Richtárik arXiv preprint arXiv:1909.04716, 2019 | 30 | 2019 |
Federated optimization algorithms with random reshuffling and gradient compression A Sadiev, G Malinovsky, E Gorbunov, I Sokolov, A Khaled, K Burlachenko, ... arXiv preprint arXiv:2206.07021, 2022 | 29 | 2022 |
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method A Khaled, K Mishchenko, C Jin NeurIPS 2023, 2023 | 25 | 2023 |
Distributed fixed point methods with compressed iterates S Chraibi, A Khaled, D Kovalev, P Richtárik, A Salim, M Takáč arXiv preprint arXiv:1912.09925, 2019 | 25 | 2019 |
Faster federated optimization under second-order similarity A Khaled, C Jin ICLR 2023, 2022 | 20 | 2022 |
FLIX: A Simple and Communication-Efficient Alternative to Local Methods in Federated Learning E Gasanov, A Khaled, S Horváth, P Richtárik AISTATS 2022 (arXiv:2111.11556), 2021 | 20 | 2021 |
Applying fast matrix multiplication to neural networks A Khaled, AF Atiya, AH Abdel-Gawad Proceedings of the 35th Annual ACM Symposium on Applied Computing, 1034-1037, 2020 | 12 | 2020 |
Tuning-Free Stochastic Optimization A Khaled, C Jin ICML 2024, 2024 | 10 | 2024 |
Directional smoothness and gradient methods: Convergence and adaptivity A Mishkin, A Khaled, Y Wang, A Defazio, R Gower Advances in Neural Information Processing Systems 37, 14810-14848, 2024 | 5 | 2024 |
A novel analysis of gradient descent under directional smoothness A Mishkin, A Khaled, A Defazio, RM Gower OPT 2023: Optimization for Machine Learning, 2023 | | 2023 |