Recent advances in convolutional neural network acceleration
In recent years, convolutional neural networks (CNNs) have achieved strong performance in
various fields such as image classification, pattern recognition, and multimedia …
Problem formulations and solvers in linear SVM: a review
The support vector machine (SVM) is an optimal-margin-based classification technique in
machine learning. SVM is a binary linear classifier that has been extended to non-linear …
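For reference, the soft-margin primal problem that linear SVM solvers address, in standard textbook notation (the review's own notation may differ):

```latex
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\lVert w\rVert^{2} \;+\; C\sum_{i=1}^{n}\xi_i
\qquad \text{s.t.}\quad y_i\bigl(w^{\top} x_i + b\bigr) \;\ge\; 1-\xi_i,\quad \xi_i \ge 0,\quad i=1,\dots,n.
```

Here C > 0 trades the width of the margin against the slack penalties incurred by points on the wrong side of it.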
Federated optimization: Distributed machine learning for on-device intelligence
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
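A minimal sketch of one communication round in this setting, in the spirit of federated averaging: each client runs local updates on its own (unevenly sized) data, and the server aggregates the results weighted by local sample counts. The squared-loss model and all names below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def local_sgd(w, data, lr=0.1, epochs=1):
    """Run SGD on one client's local data; returns updated weights."""
    X, y = data
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]   # per-sample squared-loss gradient
            w = w - lr * grad
    return w

def federated_round(w_global, clients):
    """One round: clients train locally, the server averages the resulting
    models, weighting each by its (possibly very different) data size."""
    updates = [local_sgd(w_global.copy(), data) for data in clients]
    sizes = np.array([len(data[1]) for data in clients], dtype=float)
    weights = sizes / sizes.sum()
    return sum(wk * uk for wk, uk in zip(weights, updates))
```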
A survey of optimization methods from a machine learning perspective
Machine learning has developed rapidly, producing many theoretical breakthroughs, and is
widely applied in various fields. Optimization, as an important part of machine learning, has …
signSGD: Compressed optimisation for non-convex problems
Training large neural networks requires distributing learning across multiple workers, where
the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this …
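The core mechanism is compact enough to sketch: each worker transmits only the elementwise sign of its stochastic gradient (one bit per coordinate), and the server aggregates by majority vote before broadcasting a sign back. Names below are illustrative:

```python
import numpy as np

def signsgd_majority_step(w, worker_grads, lr=0.01):
    """One signSGD step with majority vote: workers send sign(grad)
    (1 bit/coordinate), the server sums the votes and applies the
    sign of the tally, so the update is also 1 bit/coordinate."""
    votes = sum(np.sign(g) for g in worker_grads)
    return w - lr * np.sign(votes)
```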
Coordinate descent algorithms
S. J. Wright - Mathematical Programming, 2015 - Springer
Coordinate descent algorithms solve optimization problems by successively performing
approximate minimization along coordinate directions or coordinate hyperplanes. They have …
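A concrete instance: cyclic coordinate descent for least squares, where the one-dimensional minimization along each coordinate is exact and the residual is maintained incrementally. The least-squares objective is our illustrative choice (the survey treats the general case); columns of A are assumed nonzero:

```python
import numpy as np

def coordinate_descent_lsq(A, b, n_cycles=100):
    """Cyclic coordinate descent for f(x) = 0.5 * ||Ax - b||^2.
    Each inner step exactly minimizes f along one coordinate axis."""
    m, n = A.shape
    x = np.zeros(n)
    r = A @ x - b                      # residual Ax - b, kept up to date
    col_sq = (A ** 2).sum(axis=0)      # precomputed ||A[:, j]||^2
    for _ in range(n_cycles):
        for j in range(n):
            step = (A[:, j] @ r) / col_sq[j]   # exact 1-D minimizer offset
            x[j] -= step
            r -= step * A[:, j]                # O(m) residual update
    return x
```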
Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear
convergence rate for gradient descent. This condition is a special case of the Łojasiewicz …
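In standard notation: an L-smooth function f with minimum value f^* satisfies the Polyak-Łojasiewicz (PL) inequality with parameter μ > 0 if

```latex
\frac{1}{2}\,\lVert \nabla f(x) \rVert^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr) \qquad \text{for all } x,
```

and under this condition gradient descent with step size 1/L converges linearly,

```latex
f(x_k) - f^{*} \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_0) - f^{*}\bigr),
```

with no convexity assumption: strong convexity implies PL, but so do some nonconvex objectives.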
Fast matrix factorization for online recommendation with implicit feedback
This paper improves both the effectiveness and the efficiency of Matrix
Factorization (MF) methods for implicit feedback. We highlight two critical issues of existing …
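The two issues themselves are elided in the snippet. For context, a common weighted-regression objective behind implicit-feedback MF, in standard notation (not necessarily this paper's exact design):

```latex
\min_{P,\,Q}\;\; \sum_{u,i} w_{ui}\,\bigl(r_{ui} - p_u^{\top} q_i\bigr)^{2}
\;+\; \lambda\Bigl(\textstyle\sum_u \lVert p_u\rVert^{2} + \sum_i \lVert q_i\rVert^{2}\Bigr),
```

where r_{ui} ∈ {0, 1} encodes observed interactions and w_{ui} is a confidence weight; the efficiency challenge is that the sum runs over all user-item pairs, including the vast majority that are unobserved.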
First-order methods in optimization
A. Beck - 2017 - SIAM
This book, as the title suggests, is about first-order methods, namely, methods that exploit
information on values and gradients/subgradients (but not Hessians) of the functions …
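A prototypical member of this class, for concreteness: plain gradient descent, which consumes only gradient evaluations and never forms a Hessian. The toy objective below is our own illustration:

```python
import numpy as np

def gradient_descent(grad, x0, lr=0.1, n_iters=100):
    """Generic first-order loop: only gradient information is used."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - lr * grad(x)
    return x

# Minimize f(x) = ||x - c||^2, whose gradient is 2(x - c).
c = np.array([1.0, -2.0])
x_star = gradient_descent(lambda x: 2 * (x - c), np.zeros(2))
```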
A proximal stochastic gradient method with progressive variance reduction
We consider the problem of minimizing the sum of two convex functions: one is the average
of a large number of smooth component functions, and the other is a general convex …
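In the usual notation the problem is

```latex
\min_{x}\;\; F(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} f_i(x) \;+\; R(x),
```

with each f_i smooth and R convex but possibly nonsmooth. The inner iteration draws a random index i_t and takes a proximal step along a variance-reduced gradient anchored at a snapshot point x̃:

```latex
v_t \;=\; \nabla f_{i_t}(x_{t-1}) \;-\; \nabla f_{i_t}(\tilde{x}) \;+\; \nabla f(\tilde{x}),
\qquad
x_t \;=\; \operatorname{prox}_{\eta R}\!\bigl(x_{t-1} - \eta\, v_t\bigr).
```

The estimator is unbiased, E[v_t] = ∇f(x_{t−1}), and its variance shrinks as the iterates and the snapshot approach the optimum, which is the progressive variance reduction of the title.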