Object-conditioned energy-based attention map alignment in text-to-image diffusion models

Y Zhang, P Yu, YN Wu - European Conference on Computer Vision, 2024 - Springer
Text-to-image diffusion models have shown great success in generating high-quality
text-guided images. Yet, these models may still fail to semantically align generated images with …

Flow Matching: Markov Kernels, Stochastic Processes and Transport Plans

C Wald, G Steidl - arXiv preprint arXiv:2501.16839, 2025 - arxiv.org
Among generative neural models, flow matching techniques stand out for their simple
applicability and good scaling properties. Here, velocity fields of curves connecting a simple …

Unlocking the Potential of Text-to-Image Diffusion with PAC-Bayesian Theory

EH Jiang, Y Zhang, Z Zhang, Y Wan… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-image (T2I) diffusion models have revolutionized generative modeling by producing
high-fidelity, diverse, and visually realistic images from textual prompts. Despite these …