Robustness-aware 3D object detection in autonomous driving: A review and outlook

Z Song, L Liu, F Jia, Y Luo, C Jia… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
In the realm of modern autonomous driving, the perception system is indispensable for
accurately assessing the state of the surrounding environment, thereby enabling informed …

GraphBEV: Towards robust BEV feature alignment for multi-modal 3D object detection

Z Song, L Yang, S Xu, L Liu, D Xu, C Jia, F Jia… - … on Computer Vision, 2024 - Springer
Integrating LiDAR and camera information into Bird's-Eye-View (BEV) representation has
emerged as a crucial aspect of 3D object detection in autonomous driving. However …

CoreNet: Conflict Resolution Network for point-pixel misalignment and sub-task suppression of 3D LiDAR-camera object detection

Y Li, Y Yang, Z Lei - Information Fusion, 2025 - Elsevier
Fusing multi-modality inputs from different sensors is an effective way to improve the
performance of 3D object detection. However, current methods overlook two important …

Contrastive Late Fusion for 3D Object Detection

T Zhang, Z Liang, Y Yang, X Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In the field of autonomous driving, accurate and efficient 3D object detection is crucial for
ensuring safe and reliable operation. This paper focuses on the fusion of camera and LiDAR …

SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment via Contrast Learning for Multimodal Object Detection

S Dong, W Xie, D Yang, J Tian, Y Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Multimodal object detection leverages diverse modal information to enhance the accuracy
and robustness of detectors. Due to its ability to capture long-range dependencies, the …

Tran-GCN: A Transformer-Enhanced Graph Convolutional Network for Person Re-Identification in Monitoring Videos

X Hong, T Adam, M Ghazali - arXiv preprint arXiv:2409.09391, 2024 - arxiv.org
Person Re-Identification (Re-ID) has gained popularity in computer vision, enabling
cross-camera pedestrian recognition. Although the development of deep learning has provided a …

DeepInteraction++: Multi-Modality Interaction for Autonomous Driving

Z Yang, N Song, W Li, X Zhu, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing top-performance autonomous driving systems typically rely on the multi-modal
fusion strategy for reliable scene understanding. This design is however fundamentally …

Ultra-FastNet: an end-to-end learnable network for multi-person posture prediction

T Peng, Y Luo, Z Ou, J Du, G Lin - The Journal of Supercomputing, 2024 - Springer
At present, the top-down approach requires the introduction of pedestrian detection
algorithms in multi-person pose estimation. In this paper, we propose an end-to-end …

ContextNet: Leveraging Comprehensive Contextual Information for Enhanced 3D Object Detection

C Pei, S Zhang, L Cao, L Zhao - IEEE Access, 2024 - ieeexplore.ieee.org
The progress in object detection for autonomous driving using LiDAR point cloud data has
been remarkable. However, current voxel-based two-stage detectors have not fully …

Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection

P Lyu, PH Yeung, X Cheng, X Yu, C Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Unmanned aerial vehicle (UAV)-based bi-modal salient object detection (BSOD) aims to
segment salient objects in a scene utilizing complementary cues in unaligned RGB and …