Motion Prompting: Controlling Video Generation with Motion Trajectories

D Geng, C Herrmann, J Hur, F Cole, S Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Motion control is crucial for generating expressive and compelling video content; however,
most existing video generation models rely mainly on text prompts for control, which struggle …

Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation

H Jeong, CHP Huang, JC Ye, N Mitra… - arXiv preprint arXiv …, 2024 - arxiv.org
While recent foundational video generators produce visually rich output, they still struggle
with appearance drift, where objects gradually degrade or change inconsistently across …

DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models

H Kim, S Beak, H Joo - arXiv preprint arXiv:2501.08333, 2025 - arxiv.org
Understanding the human ability to use objects is crucial for AI systems that aim to improve daily life.
Existing studies for learning such ability focus on human-object patterns (e.g., contact, spatial …

InterDyn: Controllable Interactive Dynamics with Video Diffusion Models

R Akkerman, H Feng, MJ Black, D Tzionas… - arXiv preprint arXiv …, 2024 - arxiv.org
Predicting the dynamics of interacting objects is essential for both humans and intelligent
systems. However, existing approaches are limited to simplified, toy settings and lack …

FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Y Zhang, X Zhou, Y Zeng, H Xu, H Li, W Zuo - arXiv preprint arXiv …, 2025 - arxiv.org
Interactive image editing allows users to modify images through visual interaction operations
such as drawing, clicking, and dragging. Existing methods construct such supervision …

VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer

X Liu, A Zeng, W Xue, H Yang, W Luo, Q Liu… - arXiv preprint arXiv …, 2025 - arxiv.org
Crafting magic and illusions is one of the most thrilling aspects of filmmaking, with visual
effects (VFX) serving as the powerhouse behind unforgettable cinematic experiences. While …

Track-On: Transformer-based Online Point Tracking with Memory

G Aydemir, X Cai, W **e, F Güney - arxiv preprint arxiv:2501.18487, 2025 - arxiv.org
In this paper, we consider the problem of long-term point tracking, which requires consistent
identification of points across multiple frames in a video, despite changes in appearance …

TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

J Qu, H Li, S Liu, T Ren, Z Zeng, L Zhang - arXiv preprint arXiv:2411.18671, 2024 - arxiv.org
In this paper, we present TAPTRv3, which is built upon TAPTRv2 to improve its point
tracking robustness in long videos. TAPTRv2 is a simple DETR-like framework that can …

Improving Vision-Language-Action Models via Chain-of-Affordance

J Li, Y Zhu, Z Tang, J Wen, M Zhu, X Liu, C Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Robot foundation models, particularly Vision-Language-Action (VLA) models, have
garnered significant attention for their ability to enhance robot policy learning, greatly …

Exploring Temporally-Aware Features for Point Tracking

IH Kim, S Cho, J Huang, J Yi, JY Lee, S Kim - arXiv preprint arXiv …, 2025 - arxiv.org
Point tracking in videos is a fundamental task with applications in robotics, video editing, and
more. While many vision tasks benefit from pre-trained feature backbones to improve …