The step decay schedule: A near optimal, geometrically decaying learning rate procedure for least squares

R Ge, SM Kakade, R Kidambi… - Advances in neural …, 2019 - proceedings.neurips.cc
Minimax optimal convergence rates for numerous classes of stochastic convex optimization
problems are well characterized, where the majority of results utilize iterate averaged …
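
A minimal sketch of the schedule named in the title, assuming nothing beyond the abstract: SGD on a least-squares objective whose step size is cut by a constant factor after each stage. The stage length and decay factor below are illustrative choices, not the constants analyzed in the paper.

```python
import numpy as np

def sgd_step_decay(A, b, n_stages=10, eta0=0.1, decay=0.5, seed=0):
    """SGD for min_x 0.5 * ||Ax - b||^2 with a step-decay (geometrically
    decaying) learning rate: one pass over the data per stage, then the
    step size is multiplied by `decay`."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    eta = eta0
    for _ in range(n_stages):
        for i in rng.permutation(n):
            g = (A[i] @ x - b[i]) * A[i]   # stochastic gradient on sample i
            x -= eta * g
        eta *= decay                        # geometric cut between stages
    return x
```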

A machine learning approach for air quality prediction: Model regularization and optimization

D Zhu, C Cai, T Yang, X Zhou - Big data and cognitive computing, 2018 - mdpi.com
In this paper, we tackle air quality forecasting by using machine learning approaches to
predict the hourly concentration of air pollutants (e.g., ozone, particulate matter (PM 2.5) and …
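
A generic example of regularized prediction in the spirit of this entry; the features, regularizer, and model below are assumptions for illustration (L2-regularized linear regression on synthetic data), not the pipeline described in the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical setup: rows are hours, columns are meteorological and
# lagged-pollutant features; y is the next-hour PM2.5 concentration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = Ridge(alpha=1.0)   # regularization strength is an illustrative choice
model.fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))
```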

RSG: Beating subgradient method without smoothness and strong convexity

T Yang, Q Lin - Journal of Machine Learning Research, 2018 - jmlr.org
In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that
periodically restarts the standard subgradient method (SG). We show that, when applied to a …
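
A sketch of the restarting pattern described in the abstract, with illustrative constants: run the standard subgradient method for a fixed budget, restart from the stage's averaged iterate (an assumption on the restart point), and shrink the step size each stage. The stage lengths and step sizes prescribed by the paper's theory are not reproduced here.

```python
import numpy as np

def rsg(subgrad, x0, n_stages=5, t_per_stage=1000, eta0=1.0):
    """Restarted subgradient sketch: constant-step subgradient descent per
    stage, restarting from the stage's averaged iterate with a halved step."""
    x = np.asarray(x0, dtype=float)
    eta = eta0
    for _ in range(n_stages):
        xk, avg = x.copy(), np.zeros_like(x)
        for _ in range(t_per_stage):
            xk = xk - eta * subgrad(xk)
            avg += xk
        x = avg / t_per_stage   # restart point: average of the stage's iterates
        eta *= 0.5              # geometric step-size cut between stages
    return x

# Example: the nonsmooth problem min_x ||x - 1||_1.
x_hat = rsg(lambda x: np.sign(x - 1.0), x0=np.zeros(3))
```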

Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate

M Liu, X Zhang, Z Chen, X Wang… - … on Machine Learning, 2018 - proceedings.mlr.press
In this paper, we consider statistical learning with AUC (area under ROC curve)
maximization in the classical stochastic setting where one random data drawn from an …
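
For context, a generic stochastic approach to AUC maximization: SGD on a pairwise squared surrogate that pushes scores of positive examples above those of negatives. This only illustrates the objective; it is not the single-pass algorithm analyzed in the paper.

```python
import numpy as np

def pairwise_auc_sgd(X, y, n_iters=5000, eta=0.01, seed=0):
    """SGD on a pairwise squared surrogate of the AUC: for a random
    (positive, negative) pair, push the positive score above the
    negative score by a margin of 1."""
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        i, j = rng.choice(pos), rng.choice(neg)
        margin = (X[i] - X[j]) @ w
        w -= eta * 2.0 * (margin - 1.0) * (X[i] - X[j])  # grad of (margin - 1)^2
    return w

# Toy usage on synthetic binary data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
w = pairwise_auc_sgd(X, y)
```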

Advancing non-convex and constrained learning: Challenges and opportunities

T Yang - AI Matters, 2019 - dl.acm.org
As data gets more complex and applications of machine learning (ML) algorithms for
decision-making broaden and diversify, traditional ML methods by minimizing an …

Stagewise training accelerates convergence of testing error over SGD

Z Yuan, Y Yan, R Jin, T Yang - Advances in Neural …, 2019 - proceedings.neurips.cc
Stagewise training strategy is widely used for learning neural networks, which runs a
stochastic algorithm (e.g., SGD) starting with a relatively large step size (aka learning rate) …
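
A rough sketch of the stagewise strategy mentioned in the abstract: each stage runs plain SGD with a constant learning rate, and the rate is shrunk geometrically between stages. The stage counts and rates are illustrative, and details such as averaging or regularization used in the paper are omitted.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def stagewise_sgd(model, loss_fn, loader, n_stages=3, epochs_per_stage=5,
                  eta0=0.1, decay=0.1):
    """Each stage runs plain SGD with a constant learning rate; the rate is
    shrunk geometrically between stages (e.g. 0.1 -> 0.01 -> 0.001)."""
    eta = eta0
    for _ in range(n_stages):
        opt = torch.optim.SGD(model.parameters(), lr=eta)
        for _ in range(epochs_per_stage):
            for xb, yb in loader:
                opt.zero_grad()
                loss_fn(model(xb), yb).backward()
                opt.step()
        eta *= decay
    return model

# Toy usage on synthetic regression data.
X, y = torch.randn(256, 10), torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
stagewise_sgd(net, nn.MSELoss(), loader)
```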

Adapting to function difficulty and growth conditions in private optimization

H Asi, D Lévy, JC Duchi - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We develop algorithms for private stochastic convex optimization that adapt to the hardness
of the specific function we wish to optimize. While previous work provides worst-case bounds …

Universal stagewise learning for non-convex problems with convergence on averaged solutions

Z Chen, Z Yuan, J Yi, B Zhou, E Chen… - arxiv preprint arxiv …, 2018 - arxiv.org
Although the stochastic gradient descent (SGD) method and its variants (e.g., stochastic
momentum methods, AdaGrad) are the choice of algorithms for solving non-convex …

ADMM without a fixed penalty parameter: Faster convergence with new adaptive penalization

Y Xu, M Liu, Q Lin, T Yang - Advances in neural information …, 2017 - proceedings.neurips.cc
Alternating direction method of multipliers (ADMM) has received tremendous interest for
solving numerous problems in machine learning, statistics and signal processing. However …
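
To illustrate what an adaptive penalty looks like, the sketch below runs ADMM on the lasso and adjusts the penalty parameter with the classical residual-balancing heuristic; the paper proposes its own adaptive penalization scheme, which is not reproduced here.

```python
import numpy as np

def admm_lasso_adaptive(A, b, lam=0.1, rho=1.0, n_iters=200):
    """ADMM for min_x 0.5*||Ax - b||^2 + lam*||x||_1, with the penalty rho
    adapted by residual balancing (illustrative heuristic only)."""
    d = A.shape[1]
    x, z, u = np.zeros(d), np.zeros(d), np.zeros(d)
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(n_iters):
        x = np.linalg.solve(AtA + rho * np.eye(d), Atb + rho * (z - u))
        z_old = z
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z
        r = np.linalg.norm(x - z)             # primal residual
        s = rho * np.linalg.norm(z - z_old)   # dual residual
        if r > 10 * s:                        # primal dominates: raise rho
            rho *= 2.0
            u /= 2.0                          # rescale the scaled dual variable
        elif s > 10 * r:                      # dual dominates: lower rho
            rho /= 2.0
            u *= 2.0
    return z
```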

An online method for a class of distributionally robust optimization with non-convex objectives

Q Qi, Z Guo, Y Xu, R Jin, T Yang - Advances in Neural …, 2021 - proceedings.neurips.cc
In this paper, we propose a practical online method for solving a class of distributionally robust
optimization (DRO) with non-convex objectives, which has important applications in …
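
For context, a minimal example of one DRO-style objective: the KL-regularized worst-case loss over a minibatch reduces to a log-sum-exp of per-sample losses, which smoothly up-weights hard examples. This is only a sketch of one common DRO formulation, not the online algorithm developed in the paper.

```python
import math
import torch

def kl_dro_loss(per_sample_losses, lam=1.0):
    """KL-regularized DRO surrogate on a minibatch:
    lam * log(mean(exp(loss / lam))); up-weights hard examples and recovers
    the plain average loss as lam grows large."""
    n = per_sample_losses.numel()
    return lam * (torch.logsumexp(per_sample_losses / lam, dim=0) - math.log(n))

# Toy usage on the losses of a minibatch of four examples.
print(kl_dro_loss(torch.tensor([0.1, 0.2, 1.5, 0.3])))
```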