Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Concept-centric personalization with large-scale diffusion priors

P Cao, L Yang, F Zhou, T Huang, Q Song - arXiv preprint arXiv …, 2023 - arxiv.org
Despite large-scale diffusion models being highly capable of generating diverse open-world
content, they still struggle to match the photorealism and fidelity of concept-specific …

Lifting by image – leveraging image cues for accurate 3D human pose estimation

F Zhou, J Yin, P Li - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
The "lifting from 2D pose" method has been the dominant approach to 3D Human Pose
Estimation (3DHPE) due to the powerful visual analysis ability of 2D pose estimators. Widely …

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

D Bobkov, V Titov, A Alanov… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The task of manipulating real image attributes through StyleGAN inversion has been
extensively researched. This process involves searching latent variables from a well-trained …

Contrastive Decoupled Representation Learning and Regularization for Speech-Preserving Facial Expression Manipulation

T Chen, J Lin, Z Yang, C Qing, Y Shi, L Lin - International Journal of …, 2025 - Springer
Speech-preserving facial expression manipulation (SPFEM) aims to modify a talking head to
display a specific reference emotion while preserving the mouth animation of source spoken …

E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance

T Huang, P Cao, L Yang, C Liu, M Hu, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion-based image editing is a composite process of preserving the source image
content and generating new content or applying modifications. While current editing …