Learning 3D representations from 2D pre-trained models via image-to-point masked autoencoders
Pre-training by numerous image data has become de-facto for robust 2D representations. In
contrast, due to the expensive data processing, a paucity of 3D datasets severely hinders …
Not all features matter: Enhancing few-shot CLIP with adaptive prior refinement
The popularity of Contrastive Language-Image Pre-training (CLIP) has propelled its
application to diverse downstream vision tasks. To improve its capacity on downstream …
Parameter is not all you need: Starting from non-parametric networks for 3D point cloud analysis
We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists
of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k …
ViewRefer: Grasp the multi-view knowledge for 3D visual grounding
Understanding 3D scenes from multi-view inputs has been proven to alleviate the view
discrepancy issue in 3D visual grounding. However, existing methods normally neglect the …
Joint-MAE: 2D-3D joint masked autoencoders for 3D point cloud pre-training
Masked Autoencoders (MAE) have shown promising performance in self-supervised
learning for both 2D and 3D computer vision. However, existing MAE-style methods can only …
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
To reduce the reliance on large-scale datasets recent works in 3D segmentation resort to
few-shot learning. Current 3D few-shot segmentation methods first pre-train models …
ViewRefer: Grasp the multi-view knowledge for 3D visual grounding with GPT and prototype guidance
Understanding 3D scenes from multi-view inputs has been proven to alleviate the view
discrepancy issue in 3D visual grounding. However, existing methods normally neglect the …
Point-PEFT: Parameter-efficient fine-tuning for 3D pre-trained models
The popularity of pre-trained large models has revolutionized downstream tasks across
diverse fields, such as language, vision, and multi-modality. To minimize the adaption cost …
TabR: Unlocking the power of retrieval-augmented tabular deep learning
Deep learning (DL) models for tabular data problems are receiving increasingly more
attention, while the algorithms based on gradient-boosted decision trees (GBDT) remain a …
Point cloud semantic segmentation with adaptive spatial structure graph transformer
T Han, Y Chen, J Ma, X Liu, W Zhang, X Zhang… - International Journal of …, 2024 - Elsevier
With the rapid development of LiDAR and artificial intelligence technologies, 3D point cloud
semantic segmentation has become a highlight research topic. This technology is able to …