Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation
Recent works have demonstrated that using reinforcement learning (RL) with multiple
quality rewards can improve the quality of generated images in text-to-image (T2I) …
Preference tuning with human feedback on language, speech, and vision tasks: A survey
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …
Scalable ranked preference optimization for text-to-image generation
Direct Preference Optimization (DPO) has emerged as a powerful approach to align text-to-
image (T2I) models with human feedback. Unfortunately, successful application of DPO to …
IterComp: Iterative composition-aware feedback learning from model gallery for text-to-image generation
Advanced diffusion models like RPG, Stable Diffusion 3 and FLUX have made notable
strides in compositional text-to-image generation. However, these methods typically exhibit …
ComfyGen: Prompt-adaptive workflows for text-to-image generation
The practical use of text-to-image generation has evolved from simple, monolithic models to
complex workflows that combine multiple specialized components. While workflow-based …
Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning
R Barceló, C Alcázar, F Tobar - arXiv preprint arXiv:2410.08315, 2024 - arxiv.org
Fine-tuning foundation models via reinforcement learning (RL) has proven promising for
aligning to downstream objectives. In the case of diffusion models (DMs), though RL training …
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
Text-to-Image (T2I) models have made significant advancements in recent years, but they
still struggle to accurately capture intricate details specified in complex compositional …
Aligning Few-Step Diffusion Models with Dense Reward Difference Learning
Aligning diffusion models with downstream objectives is essential for their practical
applications. However, standard alignment methods often struggle with step generalization …
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
Aligning text-to-image (T2I) diffusion models with preference optimization is valuable for
human-annotated datasets, but the heavy cost of manual data collection limits scalability …
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Recent research has shown that fine-tuning diffusion models (DMs) with arbitrary rewards,
including non-differentiable ones, is feasible with reinforcement learning (RL) techniques …