Mastering text-to-image diffusion: Recaptioning, planning, and generating with multimodal LLMs

L Yang, Z Yu, C Meng, M Xu, S Ermon… - Forty-first International …, 2024 - openreview.net
Diffusion models have exhibited exceptional performance in text-to-image generation and
editing. However, existing methods often face challenges when handling complex text …

Retrieval-augmented generation for AI-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arxiv preprint arxiv …, 2024 - arxiv.org
The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …

Structure-Guided Adversarial Training of Diffusion Models

L Yang, H Qian, Z Zhang, J Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Diffusion models have demonstrated exceptional efficacy in various generative applications.
While existing models focus on minimizing a weighted sum of denoising score matching …

Consistency flow matching: Defining straight flows with velocity consistency

L Yang, Z Zhang, Z Zhang, X Liu, M Xu… - arxiv preprint arxiv …, 2024 - arxiv.org
Flow matching (FM) is a general framework for defining probability paths via Ordinary
Differential Equations (ODEs) to transform between noise and data samples. Recent …

Distribution-aware data expansion with diffusion models

H Zhu, L Yang, JH Yong, H Yin, J Jiang, M Xiao… - arxiv preprint arxiv …, 2024 - arxiv.org
The scale and quality of a dataset significantly impact the performance of deep models.
However, acquiring large-scale annotated datasets is both a costly and time-consuming …

VideoTetris: Towards Compositional Text-to-Video Generation

Y Tian, L Yang, H Yang, Y Gao, Y Deng, J Chen… - arxiv preprint arxiv …, 2024 - arxiv.org
Diffusion models have demonstrated great success in text-to-video (T2V) generation.
However, existing methods may face challenges when handling complex (long) video …

EditWorld: Simulating World Dynamics for Instruction-Following Image Editing

L Yang, B Zeng, J Liu, H Li, M Xu, W Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Diffusion models have significantly improved the performance of image editing. Existing
methods realize various approaches to achieve high-quality image editing, including but not …

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

S Zhang, A Zhao, L Yang, Z Li, C Meng, H Xu… - arxiv preprint arxiv …, 2024 - arxiv.org
Diffusion models have been applied to 3D LiDAR scene completion due to their strong
training stability and high completion quality. However, the slow sampling speed limits the …

TextMatch: Enhancing Image-Text Consistency Through Multimodal Optimization

Y Luo, M Cheng, J Ouyang, X Tao, Q Liu - arxiv preprint arxiv:2412.18185, 2024 - arxiv.org
Text-to-image generative models excel in creating images from text but struggle with
ensuring alignment and consistency between outputs and prompts. This paper introduces …

Distribution-Aware Data Expansion with Diffusion Models

L Yang, JH Yong, H Yin, J Jiang, M Xiao… - The Thirty-eighth Annual … - openreview.net
The scale and quality of a dataset significantly impact the performance of deep models.
However, acquiring large-scale annotated datasets is both a costly and time-consuming …