Convergence of Adam under relaxed assumptions
In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimation
(Adam) algorithm for a wide class of optimization objectives. Despite the popularity and …
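For reference alongside this entry, the snippet below sketches the standard Adam update (exponential moving averages of the gradient and its square, with bias correction). The function name and default hyperparameters are illustrative, not taken from the paper above.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update at step t >= 1 (bias-corrected moment estimates)."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment EMA
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment EMA
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```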
Generalized-smooth nonconvex optimization is as efficient as smooth nonconvex optimization
Various optimal gradient-based algorithms have been developed for smooth nonconvex
optimization. However, many nonconvex machine learning problems do not belong to the …
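For context, "generalized smoothness" in this line of work is commonly formalized as the (L₀, L₁)-smoothness condition; one standard formulation is:

```latex
% f is $(L_0, L_1)$-smooth if, for all $x$,
\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\|.
% Setting $L_1 = 0$ recovers the classical $L$-smoothness condition
% $\|\nabla^2 f(x)\| \le L$.
```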
Federated learning with client subsampling, data heterogeneity, and unbounded smoothness: A new algorithm and lower bounds
We study the problem of Federated Learning (FL) under client subsampling and data
heterogeneity with an objective function that has potentially unbounded smoothness. This …
Adam-family methods for nonsmooth optimization with convergence guarantees
N ** and communication compression
Achieving communication efficiency in decentralized machine learning has been attracting
significant attention, with communication compression recognized as an effective technique …
Gradient-variation online learning under generalized smoothness
Gradient-variation online learning aims to achieve regret guarantees that scale with
variations in the gradients of online functions, which has been shown to be crucial for …
Error Feedback under (L₀, L₁)-Smoothness: Normalization and Momentum
We provide the first proof of convergence for normalized error feedback algorithms across a
wide range of machine learning problems. Despite their popularity and efficiency in training …
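As context for the abstract above, the following is a minimal illustrative sketch of error feedback with a top-k compressor, momentum, and a normalized update. The function names, update order, and hyperparameters are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def topk(x, k):
    """Keep the k largest-magnitude entries, zero the rest (a contractive compressor)."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def ef_normalized_step(theta, grad, e, mom, lr=0.1, beta=0.9, k=1):
    """One step: momentum EMA, compress momentum plus accumulated error,
    carry the compression error forward, take a normalized update."""
    mom = beta * mom + (1 - beta) * grad
    c = topk(e + mom, k)                 # transmit only the compressed message
    e = e + mom - c                      # error feedback: store what was lost
    theta = theta - lr * c / (np.linalg.norm(c) + 1e-12)  # normalized step
    return theta, e, mom
```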