Disentangled representation learning

X Wang, H Chen, Z Wu, W Zhu - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Disentangled Representation Learning (DRL) aims to learn a model capable of identifying
and disentangling the underlying factors hidden in the observable data in representation …

Vmc: Video motion customization using temporal attention adaption for text-to-video diffusion models

H Jeong, GY Park, JC Ye - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Text-to-video diffusion models have advanced video generation significantly. However,
customizing these models to generate videos with tailored motions presents a substantial …

Videobooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation has witnessed rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

Motionbooth: Motion-aware customized text-to-video generation

J Wu, X Li, Y Zeng, J Zhang, Q Zhou, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present MotionBooth, an innovative framework designed for animating
customized subjects with precise control over both object and camera movements. By …

InstructVideo: instructing video diffusion models with human feedback

H Yuan, S Zhang, X Wang, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …

MC²: Multi-concept Guidance for Customized Multi-concept Generation

J Jiang, Y Zhang, K Feng, X Wu, W Li, R Pei… - arXiv preprint arXiv …, 2024 - arxiv.org
Customized text-to-image generation, which synthesizes images based on user-specified
concepts, has made significant progress in handling individual concepts. However, when …

Disenstudio: Customized multi-subject text-to-video generation with disentangled spatial control

H Chen, X Wang, Y Zhang, Y Zhou, Z Zhang… - Proceedings of the …, 2024 - dl.acm.org
Generating customized content in videos has received increasing attention recently.
However, existing works primarily focus on customized text-to-video generation for single …

Magdiff: Multi-alignment diffusion for high-fidelity video generation and editing

H Zhao, T Lu, J Gu, X Zhang, Q Zheng, Z Wu… - … on Computer Vision, 2024 - Springer
The diffusion model is widely leveraged for either video generation or video editing. As each
field has its task-specific problems, it is difficult to merely develop a single diffusion for …

Multi-modal generative AI: Multi-modal LLM, diffusion and beyond

H Chen, X Wang, Y Zhou, B Huang, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-modal generative AI has received increasing attention in both academia and industry.
Particularly, two dominant families of techniques are: i) The multi-modal large language …

Videoassembler: Identity-consistent video generation with reference entities using diffusion model

H Zhao, T Lu, J Gu, X Zhang, Z Wu, H Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Identity-consistent video generation seeks to synthesize videos that are guided by both
textual prompts and reference images of entities. Current approaches typically utilize cross …