Decomposing NeRF for editing via feature field distillation

S Kobayashi, E Matsumoto… - Advances in Neural …, 2022 - proceedings.neurips.cc
Emerging neural radiance fields (NeRF) are a promising scene representation for computer
graphics, enabling high-quality 3D reconstruction and novel view synthesis from image …

Learning visual representations via language-guided sampling

M El Banani, K Desai… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Although an object may appear in numerous contexts, we often describe it in a limited
number of ways. Language allows us to abstract away visual variation to represent and …

FeatureNeRF: Learning generalizable NeRFs by distilling foundation models

J Ye, N Wang, X Wang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Recent works on generalizable NeRFs have shown promising results on novel view
synthesis from single or few images. However, such models have rarely been applied on …

Doppelgangers: Learning to disambiguate images of similar structures

R Cai, J Tung, Q Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We consider the visual disambiguation task of determining whether a pair of visually similar
images depict the same or distinct 3D surfaces (e.g., the same or opposite sides of a …

MegaScenes: Scene-level view synthesis at scale

J Tung, G Chou, R Cai, G Yang, K Zhang… - … on Computer Vision, 2024 - Springer
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics
applications. Recently, pose-conditioned diffusion models have led to significant progress …

PartGlot: Learning shape part segmentation from language reference games

J Koo, I Huang, P Achlioptas… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce PartGlot, a neural framework and associated architectures for learning
semantic part segmentation of 3D shape geometry, based solely on part referential …

A digital 4D information system on the world scale: research challenges, approaches, and preliminary results

S Münster, F Maiwald, J Bruschke, C Kröber, Y Sun… - Applied Sciences, 2024 - mdpi.com
Numerous digital media repositories have been set up during recent decades, each
containing plenty of data about historic cityscapes. In contrast, digital 3D reconstructions of …

D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding

DZ Chen, Q Wu, M Nießner, AX Chang - European Conference on …, 2022 - Springer
Recent work on dense captioning and visual grounding in 3D has achieved impressive
results. Despite developments in both areas, the limited amount of available 3D vision …

HaLo‐NeRF: Learning Geometry‐Guided Semantics for Exploring Unconstrained Photo Collections

C Dudai, M Alper, H Bezalel, R Hanocka… - Computer Graphics …, 2024 - Wiley Online Library
Internet image collections containing photos captured by crowds of photographers show
promise for enabling digital exploration of large‐scale tourist landmarks. However, prior …

TriCoLo: Trimodal contrastive loss for text to shape retrieval

Y Ruan, HH Lee, Y Zhang, K Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-shape retrieval is an increasingly relevant problem with the growth of 3D shape data.
Recent work on contrastive losses for learning joint embeddings over multimodal data has …