DILF: Differentiable rendering-based multi-view Image–Language Fusion for zero-shot 3D shape understanding

X Ning, Z Yu, L Li, W Li, P Tiwari - Information Fusion, 2024 - Elsevier
Zero-shot 3D shape understanding aims to recognize “unseen” 3D categories that are not
present in training data. Recently, Contrastive Language–Image Pre-training (CLIP) has …

Cascade-Zero123: One image to highly consistent 3D with self-prompted nearby views

Y Chen, J Fang, Y Huang, T Yi, X Zhang, L Xie… - … on Computer Vision, 2024 - Springer
Synthesizing multi-view 3D from one single image is a significant but challenging task. Zero-
1-to-3 methods have achieved great success by lifting a 2D latent diffusion model to the 3D …

Pedestrian 3D shape understanding for person re-identification via multi-view learning

Z Yu, L Li, J Xie, C Wang, W Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent development in computing power has resulted in performance improvements on
holistic (non-occluded) person Re-Identification (ReID) tasks. Nevertheless, the precision …

EgoLoc: Revisiting 3D object localization from egocentric videos with visual queries

J Mai, A Hamdi, S Giancola, C Zhao… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the recent advances in video and 3D understanding, novel 4D spatio-temporal
methods fusing both concepts have emerged. Towards this direction, the Ego4D Episodic …

MetaDreamer: Efficient text-to-3D creation with disentangling geometry and texture

L Feng, M Wang, M Wang, K Xu, X Liu - arXiv preprint arXiv:2311.10123, 2023 - arxiv.org
Generative models for 3D object synthesis have seen significant advancements with the
incorporation of prior knowledge distilled from 2D diffusion models. Nevertheless …

Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers

L Zou, W Zhu, K Chen, L Guo, K Guo… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Semantic pattern of an object point cloud is determined by its topological configuration of
local geometries. Learning discriminative representations can be challenging due to large …

Mobile volumetric video streaming system through implicit neural representation

J Liu, Y Wang, Y Wang, Y Wang, S Cui… - Proceedings of the 2023 …, 2023 - dl.acm.org
Volumetric video (VV) emerges as a new video paradigm with six degree-of-freedom (DoF)
immersive viewing experience. Most existing VV systems focus on the point cloud (PtCl) …

Ego3DT: Tracking Every 3D Object in Ego-centric Videos

S Hao, W Chai, Z Zhao, M Sun, W Hu, J Zhou… - Proceedings of the …, 2024 - dl.acm.org
The growing interest in embodied intelligence has brought ego-centric perspectives to
contemporary research. One significant challenge within this realm is the accurate …

CAD-NeRF: learning NeRFs from uncalibrated few-view images by CAD model retrieval

X Wen, X Zhu, R Yi, Z Wang, C Zhu, K Xu - Frontiers of Computer Science, 2025 - Springer
Reconstructing from multi-view images is a longstanding problem in 3D vision, where neural
radiance fields (NeRFs) have shown great potential and produce realistic rendered images of …

LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors

Y Chen, C Yang, J Fang, X Zhang, L Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
Single-image 3D reconstruction remains a fundamental challenge in computer vision due to
inherent geometric ambiguities and limited viewpoint information. Recent advances in …