- Academic Search

Probing the 3d awareness of visual foundation models

M El Banani, A Raj, KK Maninis, A Kar… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent advances in large-scale pretraining have yielded visual foundation models with
strong capabilities. Not only can recent models generalize to arbitrary images for their …

Cited by 67 · All 3 versions

[PDF] ecva.net

PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments

Y You, K Xiong, Z Yang, Z Huang, J Zhou, R Shi… - … on Computer Vision, 2024 - Springer

We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale
benchmark designed to advance the development and evaluation of pose estimation …

Cited by 1 · All 5 versions

[PDF] thecvf.com

SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

A Engelhardt, A Raj, M Boss, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present SHINOBI, an end-to-end framework for the reconstruction of shape, material, and
illumination from object images captured with varying lighting, pose, and background. Inverse …

Cited by 5 · All 3 versions

[PDF] neurips.cc

Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis

Q Zhao, S Tulsiani - Advances in Neural Information …, 2025 - proceedings.neurips.cc

Inferring the 3D structure underlying a set of multi-view images typically requires solving two
co-dependent tasks--accurate 3D reconstruction requires precise camera poses, and …

All 5 versions

[PDF] arxiv.org

Customizing Text-to-Image Diffusion with Camera Viewpoint Control

N Kumari, G Su, R Zhang, T Park, E Shechtman… - arXiv preprint arXiv …, 2024 - arxiv.org

Model customization introduces new concepts to existing text-to-image models, enabling the
generation of the new concept in novel contexts. However, such methods lack accurate …

Cited by 4 · All 2 versions

[PDF] arxiv.org

MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Y Litman, O Patashnik, K Deng, A Agrawal… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent works in inverse rendering have shown promise in using multi-view images of an
object to recover shape, albedo, and materials. However, the recovered components often …

Cited by 1 · All 2 versions

[PDF] arxiv.org

Toward a holistic evaluation of robustness in CLIP models

W Tu, W Deng, T Gedeon - arXiv preprint arXiv:2410.01534, 2024 - arxiv.org

Contrastive Language-Image Pre-training (CLIP) models have shown significant potential,
particularly in zero-shot classification across diverse distribution shifts. Building on existing …

Cited by 1 · All 2 versions

[PDF] arxiv.org

RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models

G Heinrich, M Ranzinger, Y Lu, J Kautz, A Tao… - arXiv preprint arXiv …, 2024 - arxiv.org

Agglomerative models have recently emerged as a powerful approach to training vision
foundation models, leveraging multi-teacher distillation from existing models such as CLIP …

Cited by 1 · All 2 versions

[PDF] arxiv.org

3D Congealing: 3D-Aware Image Alignment in the Wild

Y Zhang, Z Li, A Raj, A Engelhardt, Y Li, T Hou… - … on Computer Vision, 2024 - Springer

We propose 3D Congealing, a novel problem of 3D-aware alignment for 2D images
capturing semantically similar objects. Given a collection of unlabeled Internet images, our …

All 2 versions

[PDF] arxiv.org

PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments

Y You, K Xiong, Z Yang, Z Huang, J Zhou, R Shi… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark
designed to advance the development and evaluation of pose estimation methods in …

Cited by 3 · All 2 versions

Navi: Category-agnostic image collections with high-quality 3d shape and pose annotations

Probing the 3d awareness of visual foundation models
