Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in Neural Information Processing Systems, 2023 - proceedings.neurips.cc
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
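
A minimal sketch of the underlying idea: a two-point zeroth-order gradient estimate needs only two loss evaluations (forward passes) per step, so no activations are stored for backpropagation. This illustrates the generic estimator, not the paper's exact memory-efficient implementation; `loss_fn` and the flat parameter vector are assumptions for the example.

```python
import numpy as np

def zo_grad_estimate(loss_fn, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate: forward passes only."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)           # random perturbation direction
    scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return scale * z                               # estimates the gradient of a smoothed loss

# toy usage: ZO-SGD on a quadratic
loss = lambda w: float(np.sum((w - 1.0) ** 2))
w = np.zeros(10)
for _ in range(2000):
    w -= 0.05 * zo_grad_estimate(loss, w)
```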

A zeroth-order block coordinate descent algorithm for huge-scale black-box optimization

HQ Cai, Y Lou, D McKenzie… - International Conference on Machine Learning, 2021 - proceedings.mlr.press
We consider the zeroth-order optimization problem in the huge-scale setting, where the
dimension of the problem is so large that performing even basic vector operations on the …
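
The huge-scale setting motivates updating only one coordinate block per iteration, so each step perturbs and touches a small slice of the parameter vector. A hedged sketch under simplifying assumptions (dense storage, uniformly sampled blocks); the paper's actual algorithm and sampling scheme may differ.

```python
import numpy as np

def zo_bcd_step(loss_fn, theta, block_size=64, eps=1e-3, lr=0.1, rng=None):
    """One zeroth-order block coordinate descent step on a random block."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(theta.size, size=min(block_size, theta.size), replace=False)
    z = np.zeros_like(theta)
    z[idx] = rng.standard_normal(idx.size)         # perturb only the chosen block
    scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    theta[idx] -= lr * scale * z[idx]              # update only block coordinates
    return theta
```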

A Hamilton–Jacobi-based proximal operator

S Osher, H Heaton, S Wu Fung - Proceedings of the National Academy of Sciences, 2023 - National Academy of Sciences
First-order optimization algorithms are widely used today. Two standard building blocks in
these algorithms are proximal operators (proximals) and gradients. Although gradients can …
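
For reference, the two building blocks named here are, for a cost f and step t > 0,

```latex
\operatorname{prox}_{tf}(x) = \arg\min_{y}\Big( f(y) + \tfrac{1}{2t}\|y - x\|^2 \Big),
\qquad
u(x, t) = \min_{y}\Big( f(y) + \tfrac{1}{2t}\|y - x\|^2 \Big),
```

where u is the Moreau envelope. For convex f, the Hopf–Lax formula says u solves the Hamilton–Jacobi equation $u_t + \tfrac{1}{2}\|\nabla_x u\|^2 = 0$, and $\operatorname{prox}_{tf}(x) = x - t \nabla_x u(x, t)$ wherever u is differentiable, which is the connection the title refers to.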

DeepZero: scaling up zeroth-order optimization for deep model training

A Chen, Y Zhang, J Jia, J Diffenderfer, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Zeroth-order (ZO) optimization has become a popular technique for solving machine
learning (ML) problems when first-order (FO) information is difficult or impossible to obtain …
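
One scalable alternative to random-direction estimates is a deterministic coordinate-wise finite-difference estimate, which parallelizes across coordinates. A minimal sketch of that generic estimator; DeepZero's actual method adds sparsity and reuse machinery not shown here.

```python
import numpy as np

def zo_coordinate_grad(loss_fn, theta, eps=1e-3):
    """Coordinate-wise forward-difference gradient estimate."""
    base = loss_fn(theta)
    grad = np.zeros_like(theta)
    for i in range(theta.size):                    # one extra forward pass per coordinate
        theta[i] += eps
        grad[i] = (loss_fn(theta) - base) / eps
        theta[i] -= eps
    return grad
```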

Revisiting zeroth-order optimization for memory-efficient LLM fine-tuning: A benchmark

Y Zhang, P Li, J Hong, J Li, Y Zhang, W Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
In the evolving landscape of natural language processing (NLP), fine-tuning pre-trained
Large Language Models (LLMs) with first-order (FO) optimizers like SGD and Adam has …

Zeroth-order algorithms for stochastic distributed nonconvex optimization

X Yi, S Zhang, T Yang, KH Johansson - Automatica, 2022 - Elsevier
In this paper, we consider a stochastic distributed nonconvex optimization problem with the
cost function being distributed over n agents having access only to zeroth-order (ZO) …
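
The problem structure referred to is the consensus form

```latex
\min_{x \in \mathbb{R}^d} \; f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
```

where agent i can only query function values of its local cost $f_i$, estimates $\nabla f_i$ from those queries, and exchanges iterates with its neighbors; the specific estimator and communication scheme vary by algorithm.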

How to robustify black-box ml models? a zeroth-order optimization perspective

Y Zhang, Y Yao, J Jia, J Yi, M Hong, S Chang… - arXiv preprint arXiv …, 2022 - arxiv.org
The lack of adversarial robustness has been recognized as an important issue for state-of-
the-art machine learning (ML) models, e.g., deep neural networks (DNNs). Thereby …

DPZero: dimension-independent and differentially private zeroth-order optimization

L Zhang, KK Thekumparampil, S Oh… - International Workshop on …, 2023 - openreview.net
The widespread practice of fine-tuning pretrained large language models (LLMs) on domain-
specific data faces two major challenges in memory and privacy. First, as the size of LLMs …
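
A common route to making zeroth-order updates private, plausibly related to the title's dimension-independence claim, exploits the fact that the update direction is (scalar) × z: if z is reproducible from a public seed, only the scalar finite difference has to be clipped and noised. A hypothetical sketch; calibrating sigma to a formal (ε, δ) guarantee is omitted.

```python
import numpy as np

def dp_zo_grad(loss_fn, theta, eps=1e-3, clip=1.0, sigma=1.0, rng=None):
    """ZO estimate where only the scalar difference is privatized (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)           # shareable via a public seed
    s = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    s = float(np.clip(s, -clip, clip)) + sigma * rng.standard_normal()  # clip + noise
    return s * z
```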

Stochastic zeroth-order Riemannian derivative estimation and optimization

J Li, K Balasubramanian, S Ma - Mathematics of Operations Research, 2023 - pubsonline.informs.org
We consider stochastic zeroth-order optimization over Riemannian submanifolds embedded
in Euclidean space, where the task is to solve Riemannian optimization problems with only …
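
A generic instance of the setting, sketched on the unit sphere: perturb within the tangent space, form a two-point estimate from function values alone, and retract back onto the manifold. This illustrates the recipe, not the paper's estimator or its analysis.

```python
import numpy as np

def zo_sphere_step(loss_fn, theta, eps=1e-3, lr=0.1, rng=None):
    """One zeroth-order step on the unit sphere (an embedded submanifold)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)
    z -= (z @ theta) * theta                       # project onto the tangent space at theta
    scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    theta = theta - lr * scale * z                 # step along the estimated Riemannian gradient
    return theta / np.linalg.norm(theta)           # retract back onto the sphere
```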

Zeroth-order hard-thresholding: gradient error vs. expansivity

W de Vazelhes, H Zhang, H Wu… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
$\ell_0$-constrained optimization is prevalent in machine learning, particularly for
high-dimensional problems, because it is a fundamental approach to achieve sparse …
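
The algorithmic template behind the title: a zeroth-order gradient step followed by hard thresholding, the (expansive) projection onto the $\ell_0$ ball that keeps only the k largest-magnitude entries. A hedged sketch of that template; the paper's contribution is analyzing how the ZO gradient error interacts with the operator's expansivity.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x, zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def zo_ht_step(loss_fn, theta, k, eps=1e-3, lr=0.1, rng=None):
    """Zeroth-order gradient step + hard thresholding (sparse iterate)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)
    scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return hard_threshold(theta - lr * scale * z, k)
```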