Motion-i2v: Consistent and controllable image-to-video generation with explicit motion modeling
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …
Diffusion model-based video editing: A survey
The rapid development of diffusion models (DMs) has significantly advanced image and
video applications, making "what you want is what you see" a reality. Among these, video …
Bootstap: Bootstrapped training for tracking-any-point
To endow models with greater understanding of physics and motion, it is useful to enable
them to perceive how solid surfaces move and deform in real scenes. This can be formalized …
Eto: Efficient transformer-based local feature matching by organizing multiple homography hypotheses
We tackle the efficiency problem of learning local feature matching. Recent advancements
have given rise to purely CNN-based and transformer-based approaches, each augmented …
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model
Although video generation has made great progress in capacity and controllability and is
gaining increasing attention, currently available video generation models still make minimal …
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking
4D video control is essential in video generation as it enables the use of sophisticated lens
techniques, such as multi-camera shooting and dolly zoom, which are currently unsupported …
A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding
In this paper, we propose a novel multi-view stereo (MVS) framework that gets rid of the
depth range prior. Unlike recent prior-free MVS methods that work in a pair-wise manner …
Event-Based Tracking Any Point with Motion-Augmented Temporal Consistency
Tracking Any Point (TAP) plays a crucial role in motion analysis. Video-based approaches
rely on iterative local matching for tracking, but they assume linear motion during the blind …
EgoPoints: Advancing Point Tracking for Egocentric Videos
We introduce EgoPoints, a benchmark for point tracking in egocentric videos. We annotate
4.7K challenging tracks in egocentric sequences. Compared to the popular TAP-Vid-DAVIS …
Event-aided Dense and Continuous Point Tracking
Z Wan, J Luo, Y Dai, GH Lee - openreview.net
Recent point tracking methods have made great strides in recovering the trajectories of any
point (especially key points) in long video sequences associated with large motions …