- Academic Search

Deep learning for 3D object recognition: A survey

AAM Muzahid, H Han, Y Zhang, D Li, Y Zhang… - Neurocomputing, 2024 - Elsevier

With the growing availability of extensive 3D datasets and the rapid progress in
computational power, deep learning (DL) has emerged as a highly promising approach for …

Cited by 3, 5 versions

DepthCrafter: Generating consistent long depth sequences for open-world videos

W Hu, X Gao, X Li, S Zhao, X Cun, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Despite significant advancements in monocular depth estimation for static images,
estimating video depth in the open world remains challenging, since open-world videos are …

Cited by 25, 2 versions

MonST3R: A simple approach for estimating geometry in the presence of motion

J Zhang, C Herrmann, J Hur, V Jampani… - arXiv preprint arXiv …, 2024 - arxiv.org

Estimating geometry from dynamic scenes, where objects move and deform over time,
remains a core challenge in computer vision. Current approaches often rely on multi-stage …

Cited by 24

Lotus: Diffusion-based visual foundation model for high-quality dense prediction

J He, H Li, W Yin, Y Liang, L Li, K Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org

Leveraging the visual priors of pre-trained text-to-image diffusion models offers a promising
solution to enhance zero-shot generalization in dense prediction tasks. However, existing …

Cited by 18

UniMatch V2: Pushing the limit of semi-supervised semantic segmentation

L Yang, Z Zhao, H Zhao - IEEE Transactions on Pattern …, 2025 - ieeexplore.ieee.org

Semi-supervised semantic segmentation (SSS) aims at learning rich visual knowledge from
cheap unlabeled images to enhance semantic segmentation capability. Among recent …

Cited by 3, 3 versions

Dynamic Gaussian marbles for novel view synthesis of casual monocular videos

C Stearns, A Harley, M Uy, F Dubost… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org

Gaussian splatting has become a popular representation for novel-view synthesis, exhibiting
clear strengths in efficiency, photometric quality, and compositional editability. Following its …

Cited by 7, 4 versions

MoGe: Unlocking accurate monocular geometry estimation for open-domain images with optimal training supervision

R Wang, S Xu, C Dai, J Xiang, Y Deng, X Tong… - arXiv preprint arXiv …, 2024 - arxiv.org

We present MoGe, a powerful model for recovering 3D geometry from monocular open-domain
images. Given a single image, our model directly predicts a 3D point map of the …

Cited by 6

Compressed depth map super-resolution and restoration: AIM 2024 challenge results

MV Conde, FA Vasluianu, J Xiong, W Ye… - arXiv preprint arXiv …, 2024 - arxiv.org

The increasing demand for augmented reality (AR) and virtual reality (VR) applications
highlights the need for efficient depth information processing. Depth maps, essential for …

Cited by 7, 2 versions

Unveiling deep shadows: A survey on image and video shadow detection, removal, and generation in the era of deep learning

X Hu, Z Xing, T Wang, CW Fu, PA Heng - arXiv preprint arXiv:2409.02108, 2024 - arxiv.org

Shadows are formed when light encounters obstacles, leading to areas of diminished
illumination. In computer vision, shadow detection, removal, and generation are crucial for …

Cited by 4, 2 versions

PixWizard: Versatile image-to-image visual assistant with open-language instructions

W Lin, X Wei, R Zhang, L Zhuo, S Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper presents a versatile image-to-image visual assistant, PixWizard, designed for
image generation, manipulation, and translation based on free-form language instructions …

Cited by 3, 2 versions

Depth Anything V2

Deep learning for 3D object recognition: A survey
