The step decay schedule: A near optimal, geometrically decaying learning rate procedure for least squares
Minimax optimal convergence rates for numerous classes of stochastic convex optimization
problems are well characterized, where the majority of results utilize iterate averaged …
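As a rough illustration of the step decay schedule on least squares, the sketch below runs SGD with a piecewise-constant step size that is halved after each stage; the stage count, initial step size, and decay factor are illustrative choices, not the near-optimal parameters analyzed in the paper.

```python
import numpy as np

def sgd_step_decay(A, b, eta0=0.1, T=2000, K=5, seed=0):
    """SGD for least squares 0.5*(a_i^T x - b_i)^2 averaged over rows, with a
    step-decay schedule: the step size is constant within a stage and halved
    between stages. eta0, T, K are illustrative parameters."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    eta = eta0
    per_stage = T // K
    for _ in range(K):
        for _ in range(per_stage):
            i = rng.integers(n)                  # sample one row uniformly
            grad = (A[i] @ x - b[i]) * A[i]      # stochastic gradient
            x -= eta * grad
        eta *= 0.5                               # geometric decay between stages
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((500, 20))
    x_true = rng.standard_normal(20)
    b = A @ x_true + 0.1 * rng.standard_normal(500)
    print(np.linalg.norm(sgd_step_decay(A, b) - x_true))
```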
A machine learning approach for air quality prediction: Model regularization and optimization
In this paper, we tackle air quality forecasting by using machine learning approaches to
predict the hourly concentration of air pollutants (e.g., ozone, particulate matter (PM2.5) and …
RSG: Beating subgradient method without smoothness and strong convexity
In this paper, we study the efficiency of a Restarted SubGradient (RSG) method that
periodically restarts the standard subgradient method (SG). We show that, when applied to a …
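A minimal sketch of the restart idea, assuming a nonsmooth least-absolute-deviations objective ||Ax - b||_1: run the standard subgradient method for a fixed budget, restart from the averaged iterate, and shrink the step size geometrically. The stage lengths and step sizes here are illustrative, not the paper's theoretically prescribed choices.

```python
import numpy as np

def subgradient_l1(A, b, x):
    """Subgradient of f(x) = ||Ax - b||_1."""
    return A.T @ np.sign(A @ x - b)

def restarted_subgradient(A, b, x0, stages=6, iters_per_stage=500, eta0=0.1):
    """Restarted subgradient sketch: each stage runs plain SG with a constant
    step size, then restarts from the averaged iterate with a halved step."""
    x = x0.copy()
    eta = eta0
    for _ in range(stages):
        avg = np.zeros_like(x)
        for _ in range(iters_per_stage):
            x -= eta * subgradient_l1(A, b, x)
            avg += x
        x = avg / iters_per_stage   # restart from the averaged iterate
        eta *= 0.5
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((300, 15))
    b = A @ np.ones(15) + 0.05 * rng.standard_normal(300)
    x = restarted_subgradient(A, b, np.zeros(15))
    print(np.linalg.norm(A @ x - b, 1) / 300)
```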
Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
In this paper, we consider statistical learning with AUC (area under ROC curve)
maximization in the classical stochastic setting, where one random data point drawn from an …
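For intuition, the sketch below maximizes AUC through a pairwise squared surrogate with plain SGD over sampled positive/negative pairs; the sampling scheme and step size are illustrative, and the cited paper instead uses a saddle-point (min-max) reformulation to obtain its fast rate.

```python
import numpy as np

def auc_pairwise_sgd(X, y, eta=0.01, epochs=5, seed=0):
    """SGD on the pairwise squared surrogate E[(1 - w^T(x_pos - x_neg))^2],
    a generic baseline for AUC maximization (not the paper's saddle-point
    method). Assumes both classes are present in y."""
    rng = np.random.default_rng(seed)
    pos, neg = X[y == 1], X[y == 0]
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for _ in range(len(X)):
            xp = pos[rng.integers(len(pos))]
            xn = neg[rng.integers(len(neg))]
            diff = xp - xn
            margin = w @ diff
            w -= eta * 2.0 * (margin - 1.0) * diff   # gradient of (1 - margin)^2
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((400, 8))
    y = (X[:, 0] + 0.3 * rng.standard_normal(400) > 0).astype(int)
    print(auc_pairwise_sgd(X, y))
```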
Advancing non-convex and constrained learning: Challenges and opportunities
T Yang - AI Matters, 2019 - dl.acm.org
As data gets more complex and applications of machine learning (ML) algorithms for
decision-making broaden and diversify, traditional ML methods by minimizing an …
Stagewise training accelerates convergence of testing error over sgd
The stagewise training strategy is widely used for learning neural networks; it runs a
stochastic algorithm (e.g., SGD) starting with a relatively large step size (a.k.a. learning rate) …
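A minimal sketch of this schedule in PyTorch, assuming a toy classifier and synthetic data: the step size is held fixed within each 10-epoch stage and multiplied by 0.3 between stages via StepLR. The variant analyzed in the paper additionally regularizes each stage toward the previous stage's output, which is omitted here.

```python
import torch
from torch import nn

# Stagewise-style training: SGD with a step size that is constant within a
# stage and cut by a constant factor between stages (here via StepLR).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.3)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(512, 20)
y = torch.randint(0, 2, (512,))

for epoch in range(30):            # 3 stages of 10 epochs each
    perm = torch.randperm(512)
    for i in range(0, 512, 64):
        idx = perm[i:i + 64]
        opt.zero_grad()
        loss = loss_fn(model(X[idx]), y[idx])
        loss.backward()
        opt.step()
    sched.step()                   # decay the learning rate between stages
```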
Adapting to function difficulty and growth conditions in private optimization
We develop algorithms for private stochastic convex optimization that adapt to the hardness
of the specific function we wish to optimize. While previous work provides worst-case bounds …
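For context only, the sketch below shows the standard gradient-perturbation template that private stochastic optimization methods build on: clip each stochastic gradient and add Gaussian noise. The names grad_fn, clip, and sigma are hypothetical illustration parameters, and the cited paper's contribution, adapting noise and step sizes to unknown growth conditions, is not implemented here.

```python
import numpy as np

def noisy_sgd(grad_fn, x0, n_steps=1000, eta=0.05, clip=1.0, sigma=1.0, seed=0):
    """Generic gradient-perturbation baseline: clip each stochastic gradient
    to norm `clip`, add Gaussian noise of scale sigma*clip, take an SGD step."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        g = grad_fn(x, rng)
        norm = np.linalg.norm(g)
        if norm > clip:
            g = g * (clip / norm)                            # per-sample clipping
        g = g + rng.normal(0.0, sigma * clip, size=g.shape)  # Gaussian perturbation
        x -= eta * g
    return x

if __name__ == "__main__":
    rng0 = np.random.default_rng(1)
    A = rng0.standard_normal((200, 10))
    b = A @ np.ones(10)

    def grad_fn(x, rng):            # stochastic gradient of 0.5*(a_i^T x - b_i)^2
        i = rng.integers(len(b))
        return (A[i] @ x - b[i]) * A[i]

    print(np.linalg.norm(noisy_sgd(grad_fn, np.zeros(10)) - np.ones(10)))
```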
Universal stagewise learning for non-convex problems with convergence on averaged solutions
Although the stochastic gradient descent (SGD) method and its variants (e.g., stochastic
momentum methods, AdaGrad) are the algorithms of choice for solving non-convex …
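A generic stagewise template in the same spirit, assuming a user-supplied stochastic gradient oracle: each stage runs a constant-step-size method and hands its averaged iterate to the next, longer stage with a smaller step size. The cited paper covers several base algorithms (SGD, momentum, AdaGrad) with guarantees stated for the averaged solutions; this sketch only shows the stagewise/averaging skeleton.

```python
import numpy as np

def stagewise_sgd_averaged(stoch_grad, x0, stages=5, iters=400, eta0=0.1, seed=0):
    """Stagewise skeleton: each stage runs constant-step-size SGD, outputs the
    averaged iterate of that stage, and the next stage restarts from it with a
    halved step size and a doubled budget. Stage lengths/decay are illustrative."""
    rng = np.random.default_rng(seed)
    x, eta, t = np.asarray(x0, dtype=float).copy(), eta0, iters
    for _ in range(stages):
        avg = np.zeros_like(x)
        for _ in range(t):
            x -= eta * stoch_grad(x, rng)
            avg += x
        x = avg / t          # the stage returns an averaged solution
        eta *= 0.5
        t *= 2
    return x
```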
ADMM without a fixed penalty parameter: Faster convergence with new adaptive penalization
The alternating direction method of multipliers (ADMM) has received tremendous interest for
solving numerous problems in machine learning, statistics and signal processing. However …
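For comparison, the sketch below applies the classic residual-balancing heuristic to a lasso ADMM: the penalty rho is increased when the primal residual dominates the dual residual and decreased in the opposite case. This is the textbook heuristic, shown for illustration only; it is not the adaptive penalization scheme proposed in the cited paper.

```python
import numpy as np

def soft_threshold(v, k):
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def lasso_admm_adaptive(A, b, lam, rho=1.0, iters=200, mu=10.0, tau=2.0):
    """ADMM for the lasso 0.5*||Ax-b||^2 + lam*||z||_1 s.t. x = z, with
    residual balancing for the penalty parameter rho."""
    n, d = A.shape
    x = np.zeros(d); z = np.zeros(d); u = np.zeros(d)   # u is the scaled dual
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(iters):
        x = np.linalg.solve(AtA + rho * np.eye(d), Atb + rho * (z - u))
        z_old = z
        z = soft_threshold(x + u, lam / rho)
        u = u + x - z
        r = np.linalg.norm(x - z)              # primal residual
        s = np.linalg.norm(rho * (z - z_old))  # dual residual
        if r > mu * s:
            rho *= tau; u /= tau               # rescale scaled dual when rho changes
        elif s > mu * r:
            rho /= tau; u *= tau
    return z
```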
An online method for a class of distributionally robust optimization with non-convex objectives
In this paper, we propose a practical online method for solving a class of distributionally robust
optimization (DRO) problems with non-convex objectives, which has important applications in …
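The sketch below shows one common way to write the KL-regularized DRO objective in its dual (log-sum-exp) form on a mini-batch, assuming a toy linear model and synthetic data. This naive mini-batch surrogate is biased, and handling the underlying compositional expectation properly is exactly what the cited paper's online method addresses, so the code only conveys the shape of the objective.

```python
import torch
from torch import nn

# KL-regularized DRO in dual form: robust loss = lam * log E[exp(loss / lam)],
# which up-weights hard examples. Computed naively per mini-batch below.
def kl_dro_batch_loss(per_example_losses: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    n = per_example_losses.numel()
    return lam * torch.logsumexp(per_example_losses / lam, dim=0) \
        - lam * torch.log(torch.tensor(float(n)))

model = nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss(reduction="none")   # keep per-example losses

X, y = torch.randn(256, 10), torch.randint(0, 2, (256,))
for i in range(0, 256, 32):
    opt.zero_grad()
    losses = loss_fn(model(X[i:i + 32]), y[i:i + 32])
    kl_dro_batch_loss(losses, lam=1.0).backward()
    opt.step()
```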