DILF: Differentiable rendering-based multi-view Image–Language Fusion for zero-shot 3D shape understanding
Zero-shot 3D shape understanding aims to recognize “unseen” 3D categories that are not
present in training data. Recently, Contrastive Language–Image Pre-training (CLIP) has …
Cascade-zero123: One image to highly consistent 3d with self-prompted nearby views
Synthesizing multi-view 3D from one single image is a significant but challenging task. Zero-
1-to-3 methods have achieved great success by lifting a 2D latent diffusion model to the 3D …
Pedestrian 3d shape understanding for person re-identification via multi-view learning
Recent development in computing power has resulted in performance improvements on
holistic (non-occluded) person Re-Identification (ReID) tasks. Nevertheless, the precision …
Egoloc: Revisiting 3d object localization from egocentric videos with visual queries
With the recent advances in video and 3D understanding, novel 4D spatio-temporal
methods fusing both concepts have emerged. Towards this direction, the Ego4D Episodic …
Metadreamer: Efficient text-to-3d creation with disentangling geometry and texture
L Feng, M Wang, M Wang, K Xu, X Liu - arXiv preprint arXiv:2311.10123, 2023 - arxiv.org
Generative models for 3D object synthesis have seen significant advancements with the
incorporation of prior knowledge distilled from 2D diffusion models. Nevertheless …
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers
Semantic pattern of an object point cloud is determined by its topological configuration of
local geometries. Learning discriminative representations can be challenging due to large …
Mobile volumetric video streaming system through implicit neural representation
Volumetric video (VV) emerges as a new video paradigm with six degree-of-freedom (DoF)
immersive viewing experience. Most existing VV systems focus on the point cloud (PtCl) …
Ego3DT: Tracking Every 3D Object in Ego-centric Videos
The growing interest in embodied intelligence has brought ego-centric perspectives to
contemporary research. One significant challenge within this realm is the accurate …
CAD-NeRF: learning NeRFs from uncalibrated few-view images by CAD model retrieval
Reconstructing from multi-view images is a longstanding problem in 3D vision, where neural
radiance fields (NeRFs) have shown great potential and get realistic rendered images of …
LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors
Single-image 3D reconstruction remains a fundamental challenge in computer vision due to
inherent geometric ambiguities and limited viewpoint information. Recent advances in …