- Academic Search

A tour of reinforcement learning: The view from continuous control

B Recht - Annual Review of Control, Robotics, and Autonomous …, 2019 - annualreviews.org

This article surveys reinforcement learning from the perspective of optimization and control,
with a focus on continuous control applications. It reviews the general formulation …

Cited by 799 · All 5 versions

[PDF] ieee.org

A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org

Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …

Cited by 256 · All 7 versions

[PDF] neurips.cc

Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in …, 2023 - proceedings.neurips.cc

Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …

Cited by 187 · All 6 versions

[PDF] arxiv.org

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 207 · All 6 versions

[PDF] arxiv.org

HopSkipJumpAttack: A query-efficient decision-based attack

J Chen, MI Jordan… - 2020 IEEE Symposium on …, 2020 - ieeexplore.ieee.org

The goal of a decision-based adversarial attack on a trained model is to generate
adversarial examples based solely on observing output labels returned by the targeted …

Cited by 845 · All 8 versions

[PDF] arxiv.org

Derivative-free optimization methods

J Larson, M Menickelly, SM Wild - Acta Numerica, 2019 - cambridge.org

In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …

Cited by 512 · All 9 versions

[PDF] thecvf.com

Efficient decision-based black-box adversarial attacks on face recognition

Y Dong, H Su, B Wu, Z Li, W Liu… - Proceedings of the …, 2019 - openaccess.thecvf.com

Face recognition has obtained remarkable progress in recent years due to the great
improvement of deep convolutional neural networks (CNNs). However, deep CNNs are …

Cited by 493 · All 12 versions

[PDF] nowpublishers.com

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1257 · All 7 versions

[PDF] mlr.press

Global convergence of policy gradient methods for the linear quadratic regulator

M Fazel, R Ge, S Kakade… - … conference on machine …, 2018 - proceedings.mlr.press

Direct policy gradient methods for reinforcement learning and continuous control problems
are a popular approach for a variety of reasons: 1) they are easy to implement without …

Cited by 722 · All 9 versions

[PDF] mlr.press

Optimal stochastic non-smooth non-convex optimization through online-to-non-convex conversion

A Cutkosky, H Mehta… - … Conference on Machine …, 2023 - proceedings.mlr.press

We present new algorithms for optimizing non-smooth, non-convex stochastic objectives
based on a novel analysis technique. This improves the current best-known complexity for …

Cited by 41 · All 8 versions

Online convex optimization in the bandit setting: gradient descent without a gradient
