PointCLIP: Point Cloud Understanding by CLIP R Zhang*, Z Guo*, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li CVPR 2022, 8552-8562, 2022 | 465 | 2022 |
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training R Zhang, Z Guo, P Gao, R Fang, B Zhao, D Wang, Y Qiao, H Li NeurIPS 2022, 2022 | 264 | 2022 |
Personalize Segment Anything Model with One Shot R Zhang, Z Jiang, Z Guo, S Yan, J Pan, H Dong, P Gao, H Li ICLR 2024, 2023 | 197 | 2023 |
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection R Zhang, H Qiu, T Wang, Z Guo, Z Cui, Y Qiao, H Li, P Gao ICCV 2023, 9155-9166, 2023 | 154 | 2023 |
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning X Zhu, R Zhang, B He, Z Guo, Z Zeng, Z Qin, S Zhang, P Gao ICCV 2023, 2023 | 126 | 2023 |
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention Z Guo, R Zhang, L Qiu, X Ma, X Miao, X He, B Cui AAAI 2023 Oral, 2022 | 112 | 2022 |
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? R Zhang, D Jiang, Y Zhang, H Lin, Z Guo, P Qiu, A Zhou, P Lu, KW Chang, ... ECCV 2024, 2024 | 108 | 2024 |
ImageBind-LLM: Multi-modality Instruction Tuning J Han, R Zhang, W Shao, P Gao, P Xu, H Xiao, K Zhang, C Liu, S Wen, ... arXiv preprint arXiv:2309.03905, 2023 | 107 | 2023 |
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following Z Guo, R Zhang, X Zhu, Y Tang, X Ma, J Han, K Chen, P Gao, X Li, H Li, ... arXiv preprint arXiv:2309.00615, 2023 | 104 | 2023 |
Parameter is Not All You Need: Starting from Non-parametric Networks for 3D Point Cloud Analysis R Zhang, L Wang, Z Guo, Y Wang, P Gao, H Li, J Shi CVPR 2023, 2023 | 102* | 2023 |
Can Language Understand Depth? R Zhang, Z Zeng, Z Guo, Y Li ACM MM 2022, 6868-6874, 2022 | 75 | 2022 |
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance Z Guo, Y Tang, R Zhang, D Wang, Z Wang, B Zhao, X Li ICCV 2023, 15372-15383, 2023 | 55* | 2023 |
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding S Yang, J Liu, R Zhang, M Pan, Z Guo, X Li, Z Chen, P Gao, Y Guo, ... AAAI 2025, 2023 | 53 | 2023 |
VT-CLIP: Enhancing Vision-Language Models with Visual-guided Texts L Qiu, R Zhang, Z Guo, Z Zeng, Y Li, G Zhang arXiv preprint arXiv:2112.02399, 2021 | 52 | 2021 |
Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training Z Guo, R Zhang, L Qiu, X Li, PA Heng IJCAI 2023, 2023 | 51 | 2023 |
DS-Point: A Dual-Scale 3D Framework for Point Cloud Understanding R Zhang*, Z Zeng*, Z Guo*, B Chen, G Zhang, X Liu SMC 2023, 5046-5051, 2023 | 34* | 2023 |
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine R Zhang, X Wei, D Jiang, Z Guo, S Li, Y Zhang, C Tong, J Liu, A Zhou, ... ICLR 2025, 2024 | 31* | 2024 |
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation S Yan*, R Zhang*, Z Guo*, W Chen, W Zhang, H Li, Y Qiao, Z He, P Gao AAAI 2024, 2023 | 25 | 2023 |
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis R Zhang, L Wang, Z Guo, J Shi WACV 2023, 1246-1255, 2023 | 20 | 2023 |
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation X Zhu, R Zhang, B He, Z Guo, J Liu, H Xiao, C Fu, H Dong, P Gao CVPR 2024, 2024 | 11* | 2024 |