Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

Diffusion model-based video editing: A survey

W Sun, RC Tu, J Liao, D Tao - arXiv preprint arXiv:2407.07111, 2024 - arxiv.org
The rapid development of diffusion models (DMs) has significantly advanced image and
video applications, making "what you want is what you see" a reality. Among these, video …

BootsTAP: Bootstrapped training for tracking-any-point

C Doersch, P Luc, Y Yang, D Gokay… - Proceedings of the …, 2024 - openaccess.thecvf.com
To endow models with greater understanding of physics and motion, it is useful to enable
them to perceive how solid surfaces move and deform in real scenes. This can be formalized …

ETO: Efficient transformer-based local feature matching by organizing multiple homography hypotheses

J Ni, G Zhang, G Li, Y Li, X Liu, Z Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
We tackle the efficiency problem of learning local feature matching. Recent advancements
have given rise to purely CNN-based and transformer-based approaches, each augmented …

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model

FY Wang, Z Huang, Q Ma, G Song, X Lu, W Bian… - … on Computer Vision, 2024 - Springer
Although video generation has made great progress in capacity and controllability and is
gaining increasing attention, currently available video generation models still make minimal …

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking

W Bian, Z Huang, X Shi, Y Li, FY Wang, H Li - arXiv preprint arXiv …, 2025 - arxiv.org
4D video control is essential in video generation as it enables the use of sophisticated lens
techniques, such as multi-camera shooting and dolly zoom, which are currently unsupported …

A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding

Y Dong, Y Li, Z Huang, W Bian, J Liu, H Bao… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we propose a novel multi-view stereo (MVS) framework that gets rid of the
depth range prior. Unlike recent prior-free MVS methods that work in a pair-wise manner …

Event-Based Tracking Any Point with Motion-Augmented Temporal Consistency

H Han, W Zhai, Y Cao, B Li, Z Zha - arXiv preprint arXiv:2412.01300, 2024 - arxiv.org
Tracking Any Point (TAP) plays a crucial role in motion analysis. Video-based approaches
rely on iterative local matching for tracking, but they assume linear motion during the blind …

EgoPoints: Advancing Point Tracking for Egocentric Videos

A Darkhalil, R Guerrier, AW Harley… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce EgoPoints, a benchmark for point tracking in egocentric videos. We annotate
4.7K challenging tracks in egocentric sequences. Compared to the popular TAP-Vid-DAVIS …

Event-aided Dense and Continuous Point Tracking

Z Wan, J Luo, Y Dai, GH Lee - openreview.net
Recent point tracking methods have made great strides in recovering the trajectories of any
point (especially key points) in long video sequences associated with large motions …