A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org
Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …
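
The core tool behind most ZO methods surveyed in primers like this one is a gradient estimate built purely from function queries. Below is a minimal sketch of the classic two-point randomized estimator, assuming a black-box objective f and an illustrative smoothing radius mu; it is a generic sketch, not code from the paper.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, num_queries=20):
    """Two-point randomized zeroth-order gradient estimate:
    g ≈ (d / (mu * q)) * sum_i [f(x + mu*u_i) - f(x)] * u_i,
    where the u_i are random unit directions and q = num_queries."""
    d = x.size
    fx = f(x)
    g = np.zeros(d)
    for _ in range(num_queries):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)          # random direction on the unit sphere
        g += (f(x + mu * u) - fx) * u   # forward difference along u
    return (d / (mu * num_queries)) * g

# Illustrative use: plain ZO gradient descent on a quadratic.
f = lambda x: 0.5 * np.sum(x ** 2)
x = np.ones(10)
for _ in range(200):
    x -= 0.05 * zo_gradient_estimate(f, x)
```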

Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in …, 2023 - proceedings.neurips.cc
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
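
The paper's approach (MeZO) estimates gradients with SPSA so that fine-tuning needs only forward passes. A rough single-step sketch under simplifying assumptions: a flat NumPy vector stands in for the model weights, and loss_fn is a hypothetical black-box loss.

```python
import numpy as np

def spsa_step(theta, loss_fn, lr=1e-6, eps=1e-3, seed=0):
    """One SPSA-style update: two forward passes, no backpropagation.
    In a large-model implementation the perturbation z would be
    regenerated from `seed` on the fly rather than stored, which is
    what keeps memory at inference level."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(theta.shape)
    g_hat = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return theta - lr * g_hat * z
```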

Derivative-free optimization methods

J Larson, M Menickelly, SM Wild - Acta Numerica, 2019 - cambridge.org
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …

InstructZero: Efficient instruction optimization for black-box large language models

L Chen, J Chen, T Goldstein, H Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are instruction followers, but it can be challenging to find
the best instruction for different situations, especially for black-box LLMs on which …

Data-free model extraction

JB Truong, P Maini, RJ Walls… - Proceedings of the …, 2021 - openaccess.thecvf.com
Current model extraction attacks assume that the adversary has access to a surrogate
dataset with characteristics similar to the proprietary data used to train the victim model. This …

AutoZOOM: Autoencoder-based zeroth order optimization method for attacking black-box neural networks

CC Tu, P Ting, PY Chen, S Liu, H Zhang, J Yi… - Proceedings of the AAAI …, 2019 - aaai.org
Recent studies have shown that adversarial examples in state-of-the-art image classifiers
trained by deep neural networks (DNN) can be easily generated when the target model is …

Revisiting zeroth-order optimization for memory-efficient LLM fine-tuning: A benchmark

Y Zhang, P Li, J Hong, J Li, Y Zhang, W Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained
Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has …

Derivative-free methods for policy optimization: Guarantees for linear quadratic systems

D Malik, A Pananjady, K Bhatia, K Khamaru… - Journal of Machine …, 2020 - jmlr.org
We study derivative-free methods for policy optimization over the class of linear policies. We
focus on characterizing the convergence rate of these methods when applied to linear …
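
The setting studied here admits a very compact implementation: a linear policy u_t = -K x_t whose LQR cost is evaluated by rollout and improved with two-point random search over K. A hedged sketch follows; the dynamics A, B, costs Q, R, and step sizes are illustrative placeholders, not values from the paper.

```python
import numpy as np

def lqr_cost(K, A, B, Q, R, x0, horizon=50):
    """Finite-horizon LQR cost of the linear policy u_t = -K x_t."""
    x, cost = x0.copy(), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

def random_search_step(K, cost_fn, step=0.05, smoothing=0.1, num_dirs=10):
    """Two-point derivative-free update on the policy matrix K."""
    g = np.zeros_like(K)
    for _ in range(num_dirs):
        U = np.random.randn(*K.shape)
        g += (cost_fn(K + smoothing * U) - cost_fn(K - smoothing * U)) * U
    return K - step * g / (2 * smoothing * num_dirs)
```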

Gradient-free methods for deterministic and stochastic nonsmooth nonconvex optimization

T Lin, Z Zheng, M Jordan - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Nonsmooth nonconvex optimization problems broadly emerge in machine learning and
business decision making, yet two core challenges impede the development of …

Zeroth-order stochastic variance reduction for nonconvex optimization

S Liu, B Kailkhura, PY Chen, P Ting… - Advances in neural …, 2018 - proceedings.neurips.cc
As application demands for zeroth-order (gradient-free) optimization accelerate, the need for
variance-reduced and faster-converging approaches is also intensifying. This paper …
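
The idea is to graft an SVRG-style control variate onto zeroth-order estimates. A minimal sketch in the spirit of ZO-SVRG (not the paper's exact estimator): the sampled component's gradient estimate at the current point is corrected by the same component's estimate at a snapshot, both along the same random direction, plus the full ZO gradient stored at the snapshot.

```python
import numpy as np

def zo_svrg_grad(f_i, x, x_snap, g_snap, mu=1e-3):
    """Variance-reduced zeroth-order gradient estimate.
    f_i:    one sampled component of the finite-sum objective
    x:      current iterate;  x_snap: snapshot iterate
    g_snap: full ZO gradient estimate computed at the snapshot"""
    d = x.size
    u = np.random.randn(d)
    u /= np.linalg.norm(u)                            # shared random direction
    g_x = (d / mu) * (f_i(x + mu * u) - f_i(x)) * u
    g_s = (d / mu) * (f_i(x_snap + mu * u) - f_i(x_snap)) * u
    return g_x - g_s + g_snap                         # SVRG-style correction
```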