Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Dreambench++: A human-aligned benchmark for personalized image generation

Y Peng, Y Cui, H Tang, Z Qi, R Dong, J Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
Personalized image generation holds great promise in assisting humans in everyday work
and life due to its impressive function in creatively generating personalized content …

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

S Chen, T Ye, K Zhang, Z Xing, Y Lin, L Zhu - European Conference on …, 2024 - Springer
Recent advancements in adverse weather restoration have shown potential, yet the
unpredictable and varied combinations of weather degradations in the real world pose …

Do we really need a complex agent system? Distill embodied agent into a single model

Z Zhao, K Ma, W Chai, X Wang, K Chen, D Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With the power of large language models (LLMs), open-ended embodied agents can flexibly
understand human instructions, generate interpretable guidance strategies, and output …

Learning Diffusion Texture Priors for Image Restoration

T Ye, S Chen, W Chai, Z Xing, J Qin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion Models have shown remarkable performance in image generation tasks which are
capable of generating diverse and realistic image content. When adopting diffusion models …

ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

L Eyring, S Karthik, K Roth, A Dosovitskiy… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-Image (T2I) models have made significant advancements in recent years, but they
still struggle to accurately capture intricate details specified in complex compositional …

GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu, T Ling… - arXiv preprint arXiv …, 2024 - arxiv.org
While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation

X Xing, A Saha, J He, S Hao, P Vicol, M Ryu… - arXiv preprint arXiv …, 2025 - arxiv.org
Text-to-image (T2I) generation has made significant advances in recent years, but
challenges still remain in the generation of perceptual artifacts, misalignment with complex …

Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

S Yoon, H Hwang, D Kwon, YK Noh… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a maximum entropy inverse reinforcement learning (IRL) approach for improving
the sample quality of diffusion generative models, especially when the number of generation …

A Novel Scheme for Managing Multiple Context Transitions While Ensuring Consistency in Text-to-Image Generative Artificial Intelligence

H Kim, JH Choi, JY Choi - IEEE Access, 2024 - ieeexplore.ieee.org
Humans possess an astonishing ability to understand stories presented in text and to create
related images through imagination. This cognitive ability aids in comprehension and …