Toward a theoretical foundation of policy optimization for learning control policies
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …
A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications
Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …
Fine-tuning language models with just forward passes
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
Derivative-free optimization methods
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …
Conditional gradient methods
G Braun, A Carderera, CW Combettes… - arXiv preprint arXiv …, 2022 - arxiv.org
The purpose of this survey is to serve both as a gentle introduction and a coherent overview
of state-of-the-art Frank--Wolfe algorithms, also called conditional gradient algorithms, for …
No-regret learning in time-varying zero-sum games
Learning from repeated play in a fixed two-player zero-sum game is a classic problem in
game theory and online learning. We consider a variant of this problem where the game …
Learning the globally optimal distributed LQ regulator
We study model-free learning methods for the output-feedback Linear Quadratic (LQ) control
problem in finite-horizon subject to subspace constraints on the control policy. Subspace …
Zero-th order algorithm for softmax attention optimization
Large language models (LLMs) have brought about significant transformations in human
society. Among the crucial computations in LLMs, the softmax unit holds great importance …
Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm
This work studies minimization problems with zero-order noisy oracle information under the
assumption that the objective function is highly smooth and possibly satisfies additional …
Model-free nonlinear feedback optimization
Feedback optimization is a control paradigm that enables physical systems to autonomously
reach efficient operating points. Its central idea is to interconnect optimization iterations in …