A neural space-time representation for text-to-image personalization

Y Alaluf, E Richardson, G Metzer… - ACM Transactions on …, 2023 - dl.acm.org
A key aspect of text-to-image personalization methods is the manner in which the target
concept is represented within the generative process. This choice greatly affects the visual …

Exploring low-dimensional subspaces in diffusion models for controllable image editing

S Chen, H Zhang, M Guo, Y Lu, P Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, diffusion models have emerged as a powerful class of generative models. Despite
their success, there is still limited understanding of their semantic spaces. This makes it …

Few-shot image generation by conditional relaxing diffusion inversion

Y Cao, S Gong - European Conference on Computer Vision, 2024 - Springer
In the field of Few-Shot Image Generation (FSIG) using Deep Generative Models (DGMs),
accurately estimating the distribution of target domain with minimal samples poses a …

Vision + X: A survey on multimodal learning in the light of data

Y Zhu, Y Wu, N Sebe, Y Yan - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
We are perceiving and communicating with the world in a multisensory manner, where
different information sources are sophisticatedly processed and interpreted by separate …

Diffusion in diffusion: Cyclic one-way diffusion for text-vision-conditioned generation

R Wang, Y Yang, Z Qian, Y Zhu, Y Wu - arXiv preprint arXiv:2306.08247, 2023 - arxiv.org
Originating from the diffusion phenomenon in physics that describes particle movement, the
diffusion generative models inherit the characteristics of stochastic random walk in the data …

Training-free Content Injection using h-space in Diffusion Models

J Jeong, M Kwon, Y Uh - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
Diffusion models (DMs) synthesize high-quality images in various domains. However,
controlling their generative process is still hazy because the intermediate variables in the …

One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls

M Hu, J Zheng, C Zheng, C Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
It is well known that many open-released foundational diffusion models have difficulty in
generating images that substantially depart from average brightness despite such images …

PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation

Z Zeng, J Ni, D Wang, P Rim, Y Chung, F Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper explores the potential of leveraging language priors learned by text-to-image
diffusion models to address ambiguity and visual nuisance in monocular depth estimation …

Unseen Image Synthesis with Diffusion Models

Y Zhu, Y Wu, Z Deng, O Russakovsky, Y Yan - arXiv preprint arXiv …, 2023 - arxiv.org
While the current trend in the generative field is scaling up towards larger models and more
training data for generalized domain representations, we go the opposite direction in this …

A Deep Learning Approach for Stochastic Structural Plane Generation Based on Denoising Diffusion Probabilistic Models

H Meng, X Qi, G Mei - Mathematics, 2024 - mdpi.com
The stochastic structural plane of a rock mass is the key factor controlling the stability of rock
mass. Obtaining the distribution of stochastic structural planes within a rock mass is crucial …