- Academic Search

Vision-centric bev perception: A survey

Y Ma, T Wang, X Bai, H Yang, Y Hou… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

In recent years, vision-centric Bird's Eye View (BEV) perception has garnered significant
interest from both industry and academia due to its inherent advantages, such as providing …

Cited by 134

[PDF] arxiv.org

Grounded sam: Assembling open-world models for diverse visual tasks

T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …

Cited by 240

[PDF] arxiv.org

Taptr: Tracking any point with transformers as detection

H Li, H Zhang, S Liu, Z Zeng, T Ren, F Li… - European Conference on …, 2024 - Springer

In this paper, we propose a simple yet effective approach for Tracking Any Point with
TRansformers (TAPTR). Based on the observation that point tracking bears a great …

Cited by 14

[PDF] arxiv.org

Open: Object-wise position embedding for multi-view 3d object detection

J Hou, T Wang, X Ye, Z Liu, S Gong, X Tan… - … on Computer Vision, 2024 - Springer

Accurate depth information is crucial for enhancing the performance of multi-view 3D object
detection. Despite the success of some existing multi-view 3D detectors utilizing pixel-wise …

Cited by 2

[PDF] arxiv.org

Context and geometry aware voxel transformer for semantic scene completion

Z Yu, R Zhang, J Ying, J Yu, X Hu, L Luo… - arXiv preprint arXiv …, 2024 - arxiv.org

Vision-based Semantic Scene Completion (SSC) has gained much attention due to its
widespread applications in various 3D perception tasks. Existing sparse-to-dense …

Cited by 7

[PDF] thecvf.com

Multi-View Attentive Contextualization for Multi-View 3D Object Detection

X Liu, C Zheng, M Qian, N Xue… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present Multi-View Attentive Contextualization (MvACon), a simple yet effective
method for improving 2D-to-3D feature lifting in query-based multi-view 3D (MV3D) object …

[PDF] arxiv.org

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

LH Chen, S Lu, A Zeng, H Zhang, B Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

This study delves into the realm of multi-modality (i.e., video and motion modalities) human
behavior understanding by leveraging the powerful capabilities of Large Language Models …

Cited by 22

[PDF] thecvf.com

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

H Ji, P Liang, E Cheng - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Multi-camera-based 3D object detection has made notable progress in the past several
years. However, we observe that there are cases (e.g., faraway regions) in which popular 2D …

Cited by 3

LinkOcc: 3D Semantic Occupancy Prediction with Temporal Association

W Ouyang, Z Xu, B Shen, J Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

3D semantic occupancy has garnered considerable attention due to its abundant structural
information encompassing the entire autonomous driving scene. However, existing 3D …

Cited by 2

[PDF] arxiv.org

CoreNet: Conflict Resolution Network for point-pixel misalignment and sub-task suppression of 3D LiDAR-camera object detection

Y Li, Y Yang, Z Lei - Information Fusion, 2025 - Elsevier

Fusing multi-modality inputs from different sensors is an effective way to improve the
performance of 3D object detection. However, current methods overlook two important …

Dfa3d: 3d deformable attention for 2d-to-3d feature lifting

Vision-centric bev perception: A survey
