Adam through a second-order lens

RM Clarke, B Su, JM Hernández-Lobato - arXiv preprint arXiv:2310.14963, 2023 - arxiv.org
Research into optimisation for deep learning is characterised by a tension between the
computational efficiency of first-order, gradient-based methods (such as SGD and Adam) …
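For reference, a minimal sketch of the standard first-order Adam update that the snippet contrasts with second-order methods; the hyperparameter names and defaults (lr, beta1, beta2, eps) follow the common Adam formulation and are not specific to this paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: first-order, using only the gradient plus
    exponential moving averages of its first and second moments."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```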

A Survey of Optimization Methods for Training DL Models: Theoretical Perspective on Convergence and Generalization

J Wang, A Choromanska - arXiv preprint arXiv:2501.14458, 2025 - arxiv.org
As data sets grow in size and complexity, it is becoming more difficult to pull useful features
from them using hand-crafted feature extractors. For this reason, deep learning (DL) …

Reliability Optimization Using Progressive Batching L-BFGS

M Etesam, GRM Borzadaran - Journal of Reliability …, 2024 - journals.riverpublishers.com
Reliability optimization can be applied to find parameters that increase reliability and
decrease costs, in the presence of uncertainty. Nowadays, with the increasing complexity of …

Pushing the Boundaries of Federated Learning: Super-Linear Convergence and Reinforcement Learning Over Wireless

N Dal Fabbro - 2024 - research.unipd.it
In an age defined by explosive growth in information technology, data generation, storage
and transmission have increased dramatically. This data fuels the core of machine learning …

Evaluation of a novel Momentum Based Quasi-Newton Method in Optimization Problems

AFC Laranjeira - 2024 - search.proquest.com
This work proposes a novel application of momentum to the L-BFGS Quasi-Newton method
and seeks to evaluate its performance when applied to optimization problems. We …
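Only the abstract is visible here, so the following is a hypothetical illustration of one way a momentum term could be blended into a quasi-Newton search direction, not the thesis's actual formulation; the beta hyperparameter, the direction_fn interface, and the averaging scheme are all assumptions.

```python
import numpy as np

def momentum_quasi_newton(f_grad, x0, direction_fn, steps=100, lr=0.1, beta=0.9):
    """Hypothetical sketch: smooth the quasi-Newton search direction with an
    exponential momentum average (beta is an assumed hyperparameter)."""
    x, d = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        g = f_grad(x)
        d = beta * d + (1 - beta) * direction_fn(x, g)  # momentum-averaged direction
        x = x + lr * d
    return x

# Usage: minimize ||x||^2 with a plain negative-gradient direction standing in
# for an L-BFGS two-loop direction.
x_opt = momentum_quasi_newton(lambda x: 2 * x, np.array([3.0, -2.0]),
                              lambda x, g: -g)
```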

[PDF][PDF] Evaluating Energy-Efficient Device Placement for Distributed Machine Learning

N Davis, A Brown, L Smith, E Wilson, O Miller, S Lopez - researchgate.net
Distributed machine learning tasks often involve complex device interactions that can lead to
high energy consumption. This work investigates optimal device placement strategies to …

[HTML][HTML] 2 Proposed Method

M Etesam, GRM Borzadaran - journals.riverpublishers.com
Stochastic Gradient Descent is an optimization algorithm. Compared to batch methods,
there are several motivations for using stochastic methods: it is practical (when the dataset is …
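The snippet's point, that stochastic methods trade the exact full-batch gradient for a cheap minibatch estimate, in a minimal NumPy sketch; the least-squares loss, batch size, and step size are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)
w = np.zeros(5)

def grad(w, Xb, yb):
    """Gradient of mean squared error on the (mini)batch (Xb, yb)."""
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# A batch method would evaluate grad(w, X, y) over all 1000 examples per step;
# the stochastic method below uses a cheap 32-example estimate instead.
for _ in range(200):
    idx = rng.choice(len(y), size=32, replace=False)   # sample a minibatch
    w -= 0.01 * grad(w, X[idx], y[idx])                # SGD step
```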

[PDF][PDF] Cross-Domain Comparisons of YOLOv5 Network Optimization Strategies

D Zhao, J Smith, D Ivanov, J Wang, A Petrov, S Volkov - researchgate.net
The YOLOv5 (You Only Look Once version 5) network has gained attention for its
performance in object detection tasks. This study focuses on optimizing YOLOv5 across …

[PDF][PDF] Real-Time Collaborative Signal Augmentation in Distributed Networks

R Patel, I Chen, L Evans, E Walker, A Mitchell, Z Zhang - researchgate.net
Distributed networks face significant challenges related to latency and bandwidth when it
comes to real-time data sharing and processing. To tackle these issues, we present the Real …