- Academic Search

A tour of reinforcement learning: The view from continuous control

B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org

This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …

Cited by 794

A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …

Cited by 246

Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in …, 2023 - proceedings.neurips.cc

Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …

Cited by 176

Introduction to online convex optimization

E Hazan - Foundations and Trends® in Optimization, 2016 - nowpublishers.com

This monograph portrays optimization as a process. In many practical applications the
environment is so complex that it is infeasible to lay out a comprehensive theoretical model …

Cited by 2196

HopSkipJumpAttack: A query-efficient decision-based attack

J Chen, MI Jordan… - 2020 IEEE Symposium on …, 2020 - ieeexplore.ieee.org

The goal of a decision-based adversarial attack on a trained model is to generate
adversarial examples based solely on observing output labels returned by the targeted …

Cited by 839

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

S Bubeck, N Cesa-Bianchi - Foundations and Trends® in …, 2012 - nowpublishers.com

Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …

Cited by 3281

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1249

Online learning and online convex optimization

S Shalev-Shwartz - Foundations and Trends® in Machine …, 2012 - nowpublishers.com

Online learning is a well established learning paradigm which has both theoretical and
practical appeals. The goal of online learning is to make a sequence of accurate predictions …

Cited by 2637

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 204

Private empirical risk minimization: Efficient algorithms and tight error bounds

R Bassily, A Smith, A Thakurta - 2014 IEEE 55th annual …, 2014 - ieeexplore.ieee.org

Convex empirical risk minimization is a basic tool in machine learning and statistics. We
provide new algorithms and matching lower bounds for differentially private convex …

Cited by 1150

Saved to My library

Online convex optimization in the bandit setting: gradient descent without a gradient

A tour of reinforcement learning: The view from continuous control
