Asynchronous parallel stochastic gradient for nonconvex optimization

X Lian, Y Huang, Y Li, J Liu - Advances in neural …, 2015 - proceedings.neurips.cc
Asynchronous parallel implementations of stochastic gradient (SG) have been broadly
used in training deep neural networks and have achieved many successes in practice recently …

Taming the wild: A unified analysis of hogwild-style algorithms

CM De Sa, C Zhang, K Olukotun… - Advances in neural …, 2015 - proceedings.neurips.cc
Stochastic gradient descent (SGD) is a ubiquitous algorithm for a variety of machine learning
problems. Researchers and industry have developed several techniques to optimize SGD's …

Adding vs. averaging in distributed primal-dual optimization

C Ma, V Smith, M Jaggi, M Jordan… - International …, 2015 - proceedings.mlr.press
Distributed optimization methods for large-scale machine learning suffer from a
communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and …

Stochastic quasi-gradient methods: Variance reduction via Jacobian sketching

RM Gower, P Richtárik, F Bach - Mathematical Programming, 2021 - Springer
We develop a new family of variance reduced stochastic gradient descent methods for
minimizing the average of a very large number of smooth functions. Our method—JacSketch …

A primer on coordinate descent algorithms

HJM Shi, S Tu, Y Xu, W Yin - arXiv preprint arXiv:1610.00040, 2016 - arxiv.org
This monograph presents a class of algorithms called coordinate descent algorithms for
mathematicians, statisticians, and engineers outside the field of optimization. This particular …

A comprehensive linear speedup analysis for asynchronous stochastic parallel optimization from zeroth-order to first-order

X Lian, H Zhang, CJ Hsieh… - Advances in Neural …, 2016 - proceedings.neurips.cc
Asynchronous parallel optimization has received substantial success and extensive attention
recently. One of the core theoretical questions is how much speedup (or benefit) the …

A parallel computing approach to solve traffic assignment using path-based gradient projection algorithm

X Chen, Z Liu, K Zhang, Z Wang - Transportation Research Part C …, 2020 - Elsevier
This paper presents a Parallel Block-Coordinate Descent (PBCD) algorithm for solving the
user equilibrium traffic assignment problem. Most of the existing algorithms for the user …

Distributed asynchronous optimization with unbounded delays: How slow can you go?

Z Zhou, P Mertikopoulos, N Bambos… - International …, 2018 - proceedings.mlr.press
One of the most widely used optimization methods for large-scale machine learning
problems is distributed asynchronous stochastic gradient descent (DASGD). However, a key …

Distributed multi-task relationship learning

S Liu, SJ Pan, Q Ho - Proceedings of the 23rd ACM SIGKDD …, 2017 - dl.acm.org
Multi-task learning aims to learn multiple tasks jointly by exploiting their relatedness to
improve the generalization performance for each task. Traditionally, to perform multi-task …

Fastest rates for stochastic mirror descent methods

F Hanzely, P Richtárik - Computational Optimization and Applications, 2021 - Springer
Relative smoothness—a notion introduced in Birnbaum et al. (Proceedings of the 12th ACM
Conference on Electronic Commerce, ACM, pp 127–136, 2011) and recently rediscovered in …