CoTracker: It is better to track together

N Karaev, I Rocco, B Graham, N Neverova… - … on Computer Vision, 2024 - Springer
We introduce CoTracker, a transformer-based model that tracks a large number of 2D points
in long video sequences. Differently from most existing approaches that track points …

Depth Pro: Sharp monocular metric depth in less than a second

A Bochkovskii, A Delaunoy, H Germain… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a foundation model for zero-shot metric monocular depth estimation. Our model,
Depth Pro, synthesizes high-resolution depth maps with unparalleled sharpness and high …

DepthCrafter: Generating consistent long depth sequences for open-world videos

W Hu, X Gao, X Li, S Zhao, X Cun, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite significant advancements in monocular depth estimation for static images,
estimating video depth in the open world remains challenging, since open-world videos are …

CoTracker3: Simpler and better point tracking by pseudo-labelling real videos

N Karaev, I Makarov, J Wang, N Neverova… - arXiv preprint arXiv …, 2024 - arxiv.org
Most state-of-the-art point trackers are trained on synthetic data due to the difficulty of
annotating real videos for this task. However, this can result in suboptimal performance due …

Match-Stereo-Videos: Bidirectional alignment for consistent dynamic stereo matching

J Jing, Y Mao, K Mikolajczyk - European Conference on Computer Vision, 2024 - Springer
Dynamic stereo matching is the task of estimating consistent disparities from stereo videos
with dynamic objects. Recent learning-based methods prioritize optimal performance on a …

Temporally consistent stereo matching

J Zeng, C Yao, Y Wu, Y Jia - European Conference on Computer Vision, 2024 - Springer
Stereo matching provides depth estimation from binocular images for downstream
applications. These applications mostly take video streams as input and require temporally …

A survey on deep stereo matching in the twenties

F Tosi, L Bartolomei, M Poggi - arXiv preprint arXiv:2407.07816, 2024 - arxiv.org
Stereo matching is close to hitting a half-century of history, yet witnessed a rapid evolution in
the last decade thanks to deep learning. While previous surveys in the late 2010s covered …

Depth any video with scalable synthetic data

H Yang, D Huang, W Yin, C Shen, H Liu, X He… - arXiv preprint arXiv …, 2024 - arxiv.org
Video depth estimation has long been hindered by the scarcity of consistent and scalable
ground truth data, leading to inconsistent and unreliable results. In this paper, we introduce …

A comprehensive review of quality of experience for emerging video services

W Chen, F Lan, H Wei, T Zhao, W Liu, Y Xu - Signal Processing: Image …, 2024 - Elsevier
The recent advances in multimedia technology have significantly expanded the range of
audio-visual applications. The continuous enhancement of display quality has led to the …

A Comprehensive Review of Vision-Based 3D Reconstruction Methods

L Zhou, G Wu, Y Zuo, X Chen, H Hu - Sensors, 2024 - mdpi.com
With the rapid development of 3D reconstruction, especially the emergence of algorithms
such as NeRF and 3DGS, 3D reconstruction has become a popular research topic in recent …