Identity-Preserving Text-to-Video Generation by Frequency Decomposition

S Yuan, J Huang, X He, Y Ge, Y Shi, L Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Identity-preserving text-to-video (IPT2V) generation aims to create high-fidelity videos with
consistent human identity. It is an important task in video generation but remains an open …

Non-uniform timestep sampling: Towards faster diffusion model training

T Zheng, C Geng, PT Jiang, B Wan, H Zhang… - Proceedings of the …, 2024 - dl.acm.org
Diffusion models have garnered significant success in generative tasks, emerging as the
predominant model in this domain. Despite their success, the substantial computational …

Multi-modal generative AI: Multi-modal LLM, diffusion and beyond

H Chen, X Wang, Y Zhou, B Huang, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-modal generative AI has received increasing attention in both academia and industry.
In particular, two dominant families of techniques are: i) the multi-modal large language …

Motion Prompting: Controlling Video Generation with Motion Trajectories

D Geng, C Herrmann, J Hur, F Cole, S Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Motion control is crucial for generating expressive and compelling video content; however,
most existing video generation models rely mainly on text prompts for control, which struggle …

CamI2V: Camera-controlled image-to-video diffusion model

G Zheng, T Li, R Jiang, Y Lu, T Wu, X Li - arXiv preprint arXiv:2410.15957, 2024 - arxiv.org
Recently, camera pose, as a user-friendly and physics-related condition, has been
introduced into text-to-video diffusion models for camera control. However, existing methods …

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

J Wu, C Tang, J Wang, Y Zeng, X Li, Y Tong - arXiv preprint arXiv …, 2024 - arxiv.org
Story visualization, the task of creating visual narratives from textual descriptions, has seen
progress with text-to-image generation models. However, these models often lack effective …

DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models

H Kim, S Beak, H Joo - arXiv preprint arXiv:2501.08333, 2025 - arxiv.org
Understanding the ability of humans to use objects is crucial for AI to improve daily life.
Existing studies on learning such an ability focus on human-object patterns (e.g., contact, spatial …

Trajectory Attention for Fine-grained Video Motion Control

Z Xiao, W Ouyang, Y Zhou, S Yang, L Yang, J Si… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in video generation have been greatly driven by video diffusion
models, with camera motion control emerging as a crucial challenge in creating view …

MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning

Y Han, J Zhu, Y Feng, X Ji, K He, X Li, Y Liu - arXiv preprint arXiv …, 2024 - arxiv.org
Current diffusion-based face animation methods generally adopt a ReferenceNet (a copy of
U-Net) and a large amount of curated self-acquired data to learn appearance features, as …

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Y Wu, Z Zhang, Y Li, Y Xu, A Kag, Y Sui… - arXiv preprint arXiv …, 2024 - arxiv.org
We have witnessed the unprecedented success of diffusion-based video generation over
the past year. Recently proposed models from the community have wielded the power to …