Adam can converge without any modification on update rules

Y Zhang, C Chen, N Shi, R Sun… - Advances in neural …, 2022 - proceedings.neurips.cc
Ever since Reddi et al. (2019) pointed out the divergence issue of Adam, many
new variants have been designed to obtain convergence. However, vanilla Adam remains …

The power of adaptivity in sgd: Self-tuning step sizes with unbounded gradients and affine variance

M Faw, I Tziotis, C Caramanis… - … on Learning Theory, 2022 - proceedings.mlr.press
We study convergence rates of AdaGrad-Norm as an exemplar of adaptive stochastic
gradient methods (SGD), where the step sizes change based on observed stochastic …

Efficiency of federated learning and blockchain in preserving privacy and enhancing the performance of credit card fraud detection (CCFD) systems

T Baabdullah, A Alzahrani, DB Rawat, C Liu - Future Internet, 2024 - mdpi.com
Increasing global credit card usage has elevated it to a preferred payment method for daily
transactions, underscoring its significance in global financial cybersecurity. This paper …

Predictive patterns and market efficiency: A deep learning approach to financial time series forecasting

DB Vuković, SD Radenković, I Simeunović, V Zinovev… - Mathematics, 2024 - mdpi.com
This study explores market efficiency and behavior by integrating key theories such as the
Efficient Market Hypothesis (EMH), Adaptive Market Hypothesis (AMH), Informational …

On the convergence of adam under non-uniform smoothness: Separability from sgdm and beyond

B Wang, H Zhang, Q Meng, R Sun, ZM Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper aims to clearly distinguish between Stochastic Gradient Descent with Momentum
(SGDM) and Adam in terms of their convergence rates. We demonstrate that Adam achieves …

Shuffling momentum gradient algorithm for convex optimization

TH Tran, Q Tran-Dinh, LM Nguyen - Vietnam Journal of Mathematics, 2024 - Springer
The Stochastic Gradient Method (SGD) and its stochastic variants have become
methods of choice for solving finite-sum optimization problems arising from machine …

Acceleration of stochastic gradient descent with momentum by averaging: finite-sample rates and asymptotic normality

K Tang, W Liu, Y Zhang, X Chen - arXiv preprint arXiv:2305.17665, 2023 - arxiv.org
Stochastic gradient descent with momentum (SGDM) has been widely used in many
machine learning and statistical applications. Despite the observed empirical benefits of …

Revisit last-iterate convergence of mSGD under milder requirement on step size

X He, L Chen, D Cheng… - Advances in Neural …, 2022 - proceedings.neurips.cc
Understanding convergence of SGD-based optimization algorithms can help deal with
enormous machine learning problems. To ensure last-iterate convergence of SGD and …

Revisiting the central limit theorems for the sgd-type methods

T Li, T Xiao, G Yang - arXiv preprint arXiv:2207.11755, 2022 - arxiv.org
We revisited the central limit theorem (CLT) for stochastic gradient descent (SGD) type
methods, including the vanilla SGD, momentum SGD and Nesterov accelerated SGD …

On stationary point convergence of ppo-clip

R Jin, S Li, B Wang - The Twelfth International Conference on …, 2023 - openreview.net
Proximal policy optimization (PPO) has gained popularity in reinforcement learning (RL). Its
PPO-Clip variant is one of the most frequently implemented algorithms and is one of the first-to …