- Academic Search

S Kobayashi, E Matsumoto… - Advances in Neural …, 2022 - proceedings.neurips.cc

Emerging neural radiance fields (NeRF) are a promising scene representation for computer
graphics, enabling high-quality 3D reconstruction and novel view synthesis from image …

Cited by 336

[PDF] thecvf.com

Learning visual representations via language-guided sampling

M El Banani, K Desai… - Proceedings of the ieee …, 2023 - openaccess.thecvf.com

Although an object may appear in numerous contexts, we often describe it in a limited
number of ways. Language allows us to abstract away visual variation to represent and …

Cited by 33

[PDF] thecvf.com

Featurenerf: Learning generalizable nerfs by distilling foundation models

J Ye, N Wang, X Wang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

Recent works on generalizable NeRFs have shown promising results on novel view
synthesis from single or few images. However, such models have rarely been applied on …

Cited by 36

[PDF] thecvf.com

Doppelgangers: Learning to disambiguate images of similar structures

R Cai, J Tung, Q Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

We consider the visual disambiguation task of determining whether a pair of visually similar
images depict the same or distinct 3D surfaces (e.g., the same or opposite sides of a …

Cited by 14

[PDF] arxiv.org

Megascenes: Scene-level view synthesis at scale

J Tung, G Chou, R Cai, G Yang, K Zhang… - … on Computer Vision, 2024 - Springer

Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics
applications. Recently, pose-conditioned diffusion models have led to significant progress …

Cited by 7

[PDF] thecvf.com

Partglot: Learning shape part segmentation from language reference games

J Koo, I Huang, P Achlioptas… - Proceedings of the …, 2022 - openaccess.thecvf.com

We introduce PartGlot, a neural framework and associated architectures for learning
semantic part segmentation of 3D shape geometry, based solely on part referential …

Cited by 33

[PDF] mdpi.com

A digital 4D information system on the world scale: research challenges, approaches, and preliminary results

S Münster, F Maiwald, J Bruschke, C Kröber, Y Sun… - Applied Sciences, 2024 - mdpi.com

Numerous digital media repositories have been set up during recent decades, each
containing plenty of data about historic cityscapes. In contrast, digital 3D reconstructions of …

Cited by 6

[PDF] arxiv.org

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding

DZ Chen, Q Wu, M Nießner, AX Chang - European Conference on …, 2022 - Springer

Recent work on dense captioning and visual grounding in 3D has achieved impressive
results. Despite developments in both areas, the limited amount of available 3D vision …

Cited by 31

[PDF] wiley.com

HaLo‐NeRF: Learning Geometry‐Guided Semantics for Exploring Unconstrained Photo Collections

C Dudai, M Alper, H Bezalel, R Hanocka… - Computer Graphics …, 2024 - Wiley Online Library

Internet image collections containing photos captured by crowds of photographers show
promise for enabling digital exploration of large‐scale tourist landmarks. However, prior …

Cited by 1

[PDF] thecvf.com

TriCoLo: Trimodal contrastive loss for text to shape retrieval

Y Ruan, HH Lee, Y Zhang, K Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-to-shape retrieval is an increasingly relevant problem with the growth of 3D shape data.
Recent work on contrastive losses for learning joint embeddings over multimodal data has …

Cited by 9


Towers of babel: Combining images, language, and 3d geometry for learning multimodal vision

Decomposing nerf for editing via feature field distillation
