Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation

SH Lee, Y Li, J Ke, I Yoo, H Zhang, J Yu… - … on Computer Vision, 2024 - Springer
Recent works have demonstrated that using reinforcement learning (RL) with multiple
quality rewards can improve the quality of generated images in text-to-image (T2I) …

Preference tuning with human feedback on language, speech, and vision tasks: A survey

GI Winata, H Zhao, A Das, W Tang, DD Yao… - arxiv preprint arxiv …, 2024 - arxiv.org
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …

Scalable ranked preference optimization for text-to-image generation

S Karthik, H Coskun, Z Akata, S Tulyakov, J Ren… - arxiv preprint arxiv …, 2024 - arxiv.org
Direct Preference Optimization (DPO) has emerged as a powerful approach to align text-to-
image (T2I) models with human feedback. Unfortunately, successful application of DPO to …

IterComp: Iterative composition-aware feedback learning from model gallery for text-to-image generation

X Zhang, L Yang, G Li, Y Cai, J Xie, Y Tang… - arxiv preprint arxiv …, 2024 - arxiv.org
Advanced diffusion models like RPG, Stable Diffusion 3 and FLUX have made notable
strides in compositional text-to-image generation. However, these methods typically exhibit …

ComfyGen: Prompt-adaptive workflows for text-to-image generation

R Gal, A Haviv, Y Alaluf, AH Bermano… - arxiv preprint arxiv …, 2024 - arxiv.org
The practical use of text-to-image generation has evolved from simple, monolithic models to
complex workflows that combine multiple specialized components. While workflow-based …

Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning

R Barceló, C Alcázar, F Tobar - arxiv preprint arxiv:2410.08315, 2024 - arxiv.org
Fine-tuning foundation models via reinforcement learning (RL) has proven promising for
aligning to downstream objectives. In the case of diffusion models (DMs), though RL training …

ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

L Eyring, S Karthik, K Roth, A Dosovitskiy… - arxiv preprint arxiv …, 2024 - arxiv.org
Text-to-Image (T2I) models have made significant advancements in recent years, but they
still struggle to accurately capture intricate details specified in complex compositional …

Aligning Few-Step Diffusion Models with Dense Reward Difference Learning

Z Zhang, L Shen, S Zhang, D Ye, Y Luo, M Shi… - arxiv preprint arxiv …, 2024 - arxiv.org
Aligning diffusion models with downstream objectives is essential for their practical
applications. However, standard alignment methods often struggle with step generalization …

Calibrated Multi-Preference Optimization for Aligning Diffusion Models

K Lee, X Li, Q Wang, J He, J Ke, MH Yang… - arxiv preprint arxiv …, 2025 - arxiv.org
Aligning text-to-image (T2I) diffusion models with preference optimization is valuable for
human-annotated datasets, but the heavy cost of manual data collection limits scalability …

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Z Jia, Y Nan, H Zhao, G Liu - arxiv preprint arxiv:2411.15247, 2024 - arxiv.org
Recent research has shown that fine-tuning diffusion models (DMs) with arbitrary rewards,
including non-differentiable ones, is feasible with reinforcement learning (RL) techniques …