Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org
Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …

Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in …, 2023 - proceedings.neurips.cc
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …

Derivative-free optimization methods

J Larson, M Menickelly, SM Wild - Acta Numerica, 2019 - cambridge.org
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …

Conditional gradient methods

G Braun, A Carderera, CW Combettes… - arXiv preprint arXiv …, 2022 - arxiv.org
The purpose of this survey is to serve both as a gentle introduction and a coherent overview
of state-of-the-art Frank-Wolfe algorithms, also called conditional gradient algorithms, for …

No-regret learning in time-varying zero-sum games

M Zhang, P Zhao, H Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press
Learning from repeated play in a fixed two-player zero-sum game is a classic problem in
game theory and online learning. We consider a variant of this problem where the game …

Learning the globally optimal distributed LQ regulator

L Furieri, Y Zheng… - Learning for Dynamics …, 2020 - proceedings.mlr.press
We study model-free learning methods for the output-feedback Linear Quadratic (LQ) control
problem in finite-horizon subject to subspace constraints on the control policy. Subspace …

Zero-th order algorithm for softmax attention optimization

Y Deng, Z Li, S Mahadevan… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Large language models (LLMs) have brought about significant transformations in human
society. Among the crucial computations in LLMs, the softmax unit holds great importance …

Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm

A Akhavan, E Chzhen, M Pontil, AB Tsybakov - Journal of Machine …, 2024 - jmlr.org
This work studies minimization problems with zero-order noisy oracle information under the
assumption that the objective function is highly smooth and possibly satisfies additional …

Model-free nonlinear feedback optimization

Z He, S Bolognani, J He, F Dörfler… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Feedback optimization is a control paradigm that enables physical systems to autonomously
reach efficient operating points. Its central idea is to interconnect optimization iterations in …