AUC maximization in the era of big data and AI: A survey

T Yang, Y Ying - ACM computing surveys, 2022 - dl.acm.org
Area under the ROC curve, aka AUC, is a measure of choice for assessing the performance
of a classifier for imbalanced data. AUC maximization refers to a learning paradigm that …
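
For context on the objective this survey covers: AUC is the probability that a randomly drawn positive example is scored above a randomly drawn negative one. Below is a minimal NumPy sketch of the empirical AUC and one common differentiable surrogate (a pairwise squared hinge with margin 1); the surrogate is illustrative, not necessarily the formulation the survey studies, and the function names are made up here.

```python
import numpy as np

def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly.

    AUC = P(f(x+) > f(x-)); ties count as half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]            # all pairwise score gaps
    return ((diff > 0) + 0.5 * (diff == 0)).mean()

def pairwise_squared_surrogate(scores, labels, margin=1.0):
    """Differentiable surrogate: penalize pairs with f(x+) - f(x-) < margin."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    gaps = margin - (pos[:, None] - neg[None, :])
    return np.mean(np.maximum(gaps, 0.0) ** 2)
```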

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research due to various reasons. First, its …

Convergence of Adam under relaxed assumptions

H Li, A Rakhlin, A Jadbabaie - Advances in Neural Information Processing Systems, 2023 - proceedings.neurips.cc
In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimation
(Adam) algorithm for a wide class of optimization objectives. Despite the popularity and …
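
For reference, the Adam recursion that the convergence analysis concerns, as a minimal NumPy sketch (the standard update of Kingma and Ba; `adam_step` and its default hyperparameters are illustrative):

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and
    squared gradient, bias-corrected, then a coordinate-wise scaled step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias correction, t starts at 1
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```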

Scaffold: Stochastic controlled averaging for federated learning

SP Karimireddy, S Kale, M Mohri… - International Conference on Machine Learning, 2020 - proceedings.mlr.press
Federated learning is a key scenario in modern large-scale machine learning where the
data remains distributed over a large number of clients and the task is to learn a centralized …
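
SCAFFOLD's core idea is to correct client drift with control variates. Below is a minimal sketch of the client-side step, assuming `grad_fn` returns a stochastic gradient on the client's local data; the control-variate update shown is the cheaper "option II" from the paper, and all names are illustrative:

```python
import numpy as np

def scaffold_client(x, c, c_i, grad_fn, lr=0.1, local_steps=10):
    """SCAFFOLD-style local training: drift-corrected SGD steps.

    c is the server control variate, c_i this client's; the correction
    (-c_i + c) counteracts client drift under heterogeneous data."""
    y = x.copy()
    for _ in range(local_steps):
        y -= lr * (grad_fn(y) - c_i + c)
    # cheap control-variate refresh ("option II" in the paper)
    c_i_new = c_i - c + (x - y) / (local_steps * lr)
    return y - x, c_i_new - c_i      # model delta and control delta
```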

Communication-Efficient Stochastic Gradient Descent Ascent with Momentum Algorithms

Y Zhang, M Qiu, H Gao - IJCAI, 2023 - ijcai.org
Numerous machine learning models can be formulated as a stochastic minimax optimization
problem, such as imbalanced data classification with AUC maximization. Developing …
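
For context, a minimal single-machine sketch of stochastic gradient descent ascent with momentum on min_x max_y f(x, y). The paper's contribution is a communication-efficient distributed variant, which this sketch does not capture; `grad_x`/`grad_y` stand in for stochastic gradient oracles:

```python
import numpy as np

def sgda_momentum(grad_x, grad_y, x, y, steps=100,
                  lr_x=0.01, lr_y=0.01, beta=0.9):
    """Stochastic gradient descent-ascent with moving-average momentum
    on min_x max_y f(x, y): descend in x, ascend in y."""
    mx = np.zeros_like(x)
    my = np.zeros_like(y)
    for _ in range(steps):
        mx = beta * mx + (1 - beta) * grad_x(x, y)
        my = beta * my + (1 - beta) * grad_y(x, y)
        x = x - lr_x * mx
        y = y + lr_y * my
    return x, y
```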

Faster adaptive federated learning

X Wu, F Huang, Z Hu, H Huang - Proceedings of the AAAI Conference on Artificial Intelligence, 2023 - ojs.aaai.org
Federated learning has attracted increasing attention with the emergence of distributed data.
While many federated learning algorithms have been proposed for the non-convex …

Lower bounds for non-convex stochastic optimization

Y Arjevani, Y Carmon, JC Duchi, DJ Foster… - Mathematical Programming, 2023 - Springer
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ)
using stochastic first-order methods. In a well-studied model where algorithms access …
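
In LaTeX, the abstract's stationarity notion together with the headline scaling in this oracle model (the Ω(ε⁻⁴) rate is the well-known result for variance-bounded stochastic first-order methods):

```latex
% epsilon-stationarity and the oracle-complexity scaling discussed above
\[
  \|\nabla f(x)\| \le \epsilon
  \qquad \text{($\epsilon$-stationary point)}
\]
\[
  \mathbb{E}\,\|g(x,\xi) - \nabla f(x)\|^2 \le \sigma^2
  \;\Longrightarrow\;
  \Omega(\epsilon^{-4}) \ \text{stochastic gradient evaluations required}
\]
```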

Momentum-based variance reduction in non-convex SGD

A Cutkosky, F Orabona - Advances in Neural Information Processing Systems, 2019 - proceedings.neurips.cc
Variance reduction has emerged in recent years as a strong competitor to stochastic
gradient descent in non-convex problems, providing the first algorithms to improve upon the …
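
The estimator behind this line of work (STORM-style) combines momentum with a correction term evaluated on the same fresh sample at both the current and previous iterate. A minimal NumPy sketch with constant parameters, whereas the paper uses adaptive schedules; `grad_fn(x, seed)` is an assumed stochastic-gradient oracle:

```python
import numpy as np

def storm(grad_fn, x, steps=100, lr=0.01, a=0.1):
    """Momentum-based variance reduction:
    d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
    where both gradients in the correction use the *same* fresh sample."""
    rng = np.random.default_rng(0)
    x_prev, d = None, None
    for _ in range(steps):
        xi = int(rng.integers(1 << 30))   # seed standing in for a data sample
        g_new = grad_fn(x, xi)
        if d is None:
            d = g_new                     # plain gradient on the first step
        else:
            d = g_new + (1 - a) * (d - grad_fn(x_prev, xi))
        x_prev, x = x, x - lr * d
    return x
```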

Provably faster algorithms for bilevel optimization

J Yang, K Ji, Y Liang - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
Bilevel optimization has been widely applied in many important machine learning
applications such as hyperparameter optimization and meta-learning. Recently, several …
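
For context, the generic implicit-differentiation hypergradient that many bilevel methods approximate (practical algorithms, including those in this line of work, avoid forming these matrices explicitly); all function names here are illustrative:

```python
import numpy as np

def hypergradient(x, y_star, grad_x_F, grad_y_F, hess_xy_G, hess_yy_G):
    """Implicit-differentiation hypergradient for
    min_x F(x, y*(x)),  y*(x) = argmin_y G(x, y):
    dF/dx = grad_x F - hess_xy(G) @ inv(hess_yy(G)) @ grad_y F."""
    v = np.linalg.solve(hess_yy_G(x, y_star), grad_y_F(x, y_star))
    return grad_x_F(x, y_star) - hess_xy_G(x, y_star) @ v
```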

Only train once: A one-shot neural network training and pruning framework

T Chen, B Ji, T Ding, B Fang, G Wang… - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
Structured pruning is a commonly used technique in deploying deep neural networks
(DNNs) onto resource-constrained devices. However, the existing pruning methods are …
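
For context only: a generic magnitude-based structured-pruning baseline (zero out the weakest output channels so the corresponding units can be removed). This is not OTO's one-shot method, which trains with zero-invariant groups; the sketch just illustrates what "structured" pruning means:

```python
import numpy as np

def prune_channels(weight, keep_ratio=0.5):
    """Zero out whole output channels (rows of an [out, in] weight
    matrix) with the smallest L2 norms; pruned rows can then be
    dropped from the network, shrinking it structurally."""
    norms = np.linalg.norm(weight, axis=1)
    k = int(len(norms) * keep_ratio)
    keep = np.argsort(norms)[-k:]          # indices of the strongest channels
    mask = np.zeros(len(norms), dtype=bool)
    mask[keep] = True
    return weight * mask[:, None], mask
```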