[BOOK] Control systems and reinforcement learning
S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …
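A minimal sketch of the tabular update behind the 'Q' in deep Q-learning, assuming a toy table and transition for illustration (not taken from the book):

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One temporal-difference update: Q(s,a) <- Q(s,a) + alpha * (target - Q(s,a))."""
    target = r + gamma * np.max(Q[s_next])   # bootstrap from the best next action
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Usage with a hypothetical 5-state, 2-action table:
Q = np.zeros((5, 2))
Q = q_learning_step(Q, s=0, a=1, r=1.0, s_next=2)
```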
On the optimization of deep networks: Implicit acceleration by overparameterization
Conventional wisdom in deep learning states that increasing depth improves
expressiveness but complicates optimization. This paper suggests that, sometimes …
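A sketch of the overparameterization idea the abstract alludes to: the same linear regression loss is minimized either directly over W or over a product W1 @ W2 (a deep linear network). The toy data, sizes, and stepsize below are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
W_true = rng.normal(size=(10, 1))
y = X @ W_true

def loss(W):
    return 0.5 * np.mean((X @ W - y) ** 2)

def grad(W):
    return X.T @ (X @ W - y) / len(X)

eta = 0.05
W = np.zeros((10, 1))                      # (a) plain parameterization
W1 = 0.1 * rng.normal(size=(10, 10))       # (b) overparameterized: end-to-end map W1 @ W2
W2 = 0.1 * rng.normal(size=(10, 1))

for _ in range(500):
    W -= eta * grad(W)
    g = grad(W1 @ W2)                      # gradient w.r.t. the end-to-end matrix
    W1 -= eta * g @ W2.T                   # chain rule through the factorization
    W2 -= eta * W1.T @ g

print("plain:", loss(W), "overparameterized:", loss(W1 @ W2))
```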
Acceleration methods
This monograph covers some recent advances in a range of acceleration techniques
frequently used in convex optimization. We first use quadratic optimization problems to …
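A small sketch of the quadratic setting typically used to motivate acceleration: plain gradient descent versus heavy-ball momentum on f(x) = 0.5 xᵀAx, with the matrix and iteration count assumed for illustration:

```python
import numpy as np

eigs = np.linspace(1.0, 100.0, 50)           # mu = 1, L = 100, condition number 100
A = np.diag(eigs)
mu, L = eigs[0], eigs[-1]

x_gd = x_hb = x_prev = np.ones(50)
eta_gd = 2.0 / (mu + L)                                      # optimal fixed step for GD
eta_hb = 4.0 / (np.sqrt(mu) + np.sqrt(L)) ** 2               # classical heavy-ball tuning
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

for _ in range(100):
    x_gd = x_gd - eta_gd * (A @ x_gd)
    x_new = x_hb - eta_hb * (A @ x_hb) + beta * (x_hb - x_prev)
    x_prev, x_hb = x_hb, x_new

# Momentum contracts at roughly (sqrt(kappa)-1)/(sqrt(kappa)+1) per step,
# versus (kappa-1)/(kappa+1) for plain gradient descent.
print("GD error:", np.linalg.norm(x_gd), "heavy-ball error:", np.linalg.norm(x_hb))
```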
Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks
Stochastic gradient descent (SGD) is widely believed to perform implicit regularization when
used to train deep neural networks, but the precise manner in which this occurs has thus far …
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient
In this paper, we study the problem of sampling from a given probability density function that
is known to be smooth and strongly log-concave. We analyze several methods of …
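A minimal sketch of the (unadjusted) Langevin Monte Carlo iteration for a smooth, strongly log-concave target π(x) ∝ exp(-U(x)); the Gaussian target and stepsize are assumed for illustration, and an inexact gradient would simply replace grad_U below with a noisy or biased estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2
Sigma_inv = np.array([[2.0, 0.3], [0.3, 1.0]])   # target: N(0, Sigma), U(x) = 0.5 x^T Sigma^{-1} x

def grad_U(x):
    return Sigma_inv @ x

h = 0.05                                         # stepsize
x = np.zeros(d)
samples = []
for _ in range(10_000):
    # One LMC step: gradient drift plus Gaussian noise of variance 2h.
    x = x - h * grad_U(x) + np.sqrt(2 * h) * rng.normal(size=d)
    samples.append(x)

print("empirical covariance:\n", np.cov(np.array(samples).T))
```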
Understanding the acceleration phenomenon via high-resolution differential equations
Gradient-based optimization algorithms can be studied from the perspective of limiting
ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not …
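For context, a sketch of the ODE picture this line of work refines (the low-resolution limit is standard; the exact high-resolution equations are in the paper):

```latex
% Low-resolution ODE limit of Nesterov's accelerated gradient for convex f
% (stepsize s -> 0):
%   \ddot{X}(t) + \tfrac{3}{t}\,\dot{X}(t) + \nabla f(X(t)) = 0 .
% A high-resolution ODE retains O(\sqrt{s}) correction terms, e.g. a
% gradient-correction (Hessian damping) term of the form
%   \sqrt{s}\,\nabla^2 f(X(t))\,\dot{X}(t),
% which distinguishes Nesterov's method from heavy-ball at this scale.
```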
Underdamped Langevin MCMC: A non-asymptotic analysis
We study the underdamped Langevin diffusion when the log of the target distribution is
smooth and strongly concave. We present an MCMC algorithm based on its discretization and …
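A rough Euler-Maruyama sketch of the underdamped Langevin diffusion dx = v dt, dv = -γv dt - ∇f(x) dt + √(2γ) dB, whose x-marginal targets exp(-f(x)). The paper analyzes a sharper discretization; the Gaussian target, friction, and stepsize here are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
grad_f = lambda x: x                    # f(x) = 0.5 ||x||^2, i.e. a standard Gaussian target
gamma, h, d = 2.0, 0.05, 2

x, v = np.zeros(d), np.zeros(d)
samples = []
for _ in range(20_000):
    v = v - h * (gamma * v + grad_f(x)) + np.sqrt(2 * gamma * h) * rng.normal(size=d)
    x = x + h * v
    samples.append(x.copy())

# Discard burn-in, then check the marginal variance of x (should be close to 1 per coordinate).
print("empirical variance per coordinate:", np.var(np.array(samples)[5000:], axis=0))
```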
Acceleration by stepsize hedging: Multi-step descent and the silver stepsize schedule
Can we accelerate the convergence of gradient descent without changing the algorithm—
just by judiciously choosing stepsizes? Surprisingly, we show that the answer is yes. Our …
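A toy illustration of "acceleration by stepsize choice alone" on a two-eigenvalue quadratic: alternating the stepsizes 1/L and 1/μ zeroes both eigencomponents in two steps, while the best constant stepsize barely contracts when L/μ is large. This is only an illustrative substitute; the actual silver stepsize schedule is the paper's recursive construction and is not reproduced here:

```python
import numpy as np

mu, L = 1.0, 100.0
A = np.diag([mu, L])                     # f(x) = 0.5 x^T A x
x_const = x_sched = np.array([1.0, 1.0])

eta_const = 2.0 / (mu + L)               # best constant stepsize
schedule = [1.0 / L, 1.0 / mu]           # illustrative non-constant schedule

for k in range(2):
    x_const = x_const - eta_const * (A @ x_const)
    x_sched = x_sched - schedule[k] * (A @ x_sched)

print("constant step after 2 iters:", np.linalg.norm(x_const))
print("scheduled steps after 2 iters:", np.linalg.norm(x_sched))   # exactly zero
```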
Accelerated gradient descent escapes saddle points faster than gradient descent
Nesterov's accelerated gradient descent (AGD), an instance of the general family of
“momentum methods,” provably achieves a faster convergence rate than gradient descent …
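A sketch of the Nesterov/momentum iteration the abstract refers to, run on a simple nonconvex function with a strict saddle at the origin; the function, stepsize, and momentum parameter are assumed for illustration, not taken from the paper:

```python
import numpy as np

# f(x, y) = 0.5 x^2 + 0.25 y^4 - 0.5 y^2 has a saddle at (0, 0) and minima at (0, ±1).
def grad(z):
    x, y = z
    return np.array([x, y**3 - y])

eta, beta = 0.05, 0.9
z = z_prev = np.array([0.01, 0.001])     # start near the saddle point

for _ in range(200):
    y_lookahead = z + beta * (z - z_prev)              # momentum extrapolation
    z_prev, z = z, y_lookahead - eta * grad(y_lookahead)

print("final iterate:", z)               # approaches a local minimum at y = ±1
```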
Stochastic modified equations and adaptive stochastic gradient algorithms
We develop the method of stochastic modified equations (SME), in which stochastic gradient
algorithms are approximated in the weak sense by continuous-time stochastic differential …
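A brief sketch of the modified-equation idea, stated for plain SGD with the notation below assumed for illustration:

```latex
% The SGD iteration
%   x_{k+1} = x_k - \eta \, \nabla f_{\gamma_k}(x_k),
% with random minibatch index \gamma_k, is approximated in the weak sense
% (on the timescale t \approx k\eta) by an SDE of the form
%   dX_t = -\nabla f(X_t)\,dt + \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,dW_t,
% where \Sigma(x) is the covariance of the minibatch gradient \nabla f_\gamma(x).
% Higher-order versions add O(\eta) drift corrections.
```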