MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation

X Hao, R Li, H Zhang, D Li, R Yin, S Jung… - … on Computer Vision, 2024 - Springer
Online high-definition (HD) map construction is an important and challenging task in
autonomous driving. Recently, there has been a growing interest in cost-effective multi-view …

4D Contrastive Superflows are Dense 3D Representation Learners

X Xu, L Kong, H Shuai, W Zhang, L Pan, K Chen… - … on Computer Vision, 2024 - Springer
In the realm of autonomous driving, accurate 3D perception is the foundation. However,
developing such models relies on extensive human annotations–a process that is both …

See the Unseen: Grid-Wise Drivable Area Detection Dataset and Network Using LiDAR

CR Goenawan, DH Paek, SH Kong - Remote Sensing, 2024 - mdpi.com
Drivable Area (DA) detection is crucial for autonomous driving. Camera-based methods rely
heavily on illumination conditions and often fail to capture accurate 3D information, while …

UniDrive: Towards Universal Driving Perception Across Camera Configurations

Y Li, W Zheng, X Huang, K Keutzer - arxiv preprint arxiv:2410.13864, 2024 - arxiv.org
Vision-centric autonomous driving has demonstrated excellent performance with
economical sensors. As the fundamental step, 3D perception aims to infer 3D information …

MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception

X Hao, G Liu, Y Zhao, Y Ji, M Wei, H Zhao… - arxiv preprint arxiv …, 2025 - arxiv.org
Multi-sensor fusion models play a crucial role in autonomous driving perception, particularly
in tasks like 3D object detection and HD map construction. These models provide essential …

LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes

X Xu, L Kong, H Shuai, L Pan, Z Liu, Q Liu - arxiv preprint arxiv …, 2025 - arxiv.org
LiDAR data pretraining offers a promising approach to leveraging large-scale, readily
available datasets for enhanced data utilization. However, existing methods predominantly …

GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency

D Lu, L Kong, T Huang, GH Lee - arxiv preprint arxiv:2412.09511, 2024 - arxiv.org
Identifying affordance regions on 3D objects from semantic cues is essential for robotics and
human-machine interaction. However, existing 3D affordance learning methods struggle …

LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving

L Kong, X Xu, Y Liu, J Cen, R Chen, W Zhang… - arxiv preprint arxiv …, 2025 - arxiv.org
Recent advancements in vision foundation models (VFMs) have revolutionized visual
perception in 2D, yet their potential for 3D scene understanding, particularly in autonomous …

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

S Xie, L Kong, Y Dong, C Sima, W Zhang… - arxiv preprint arxiv …, 2025 - arxiv.org
Recent advancements in Vision-Language Models (VLMs) have sparked interest in their use
for autonomous driving, particularly in generating interpretable driving decisions through …

KALAHash: Knowledge-Anchored Low-Resource Adaptation for Deep Hashing

S Zhao, T Yu, X Hao, W Ma, V Narayanan - arxiv preprint arxiv …, 2024 - arxiv.org
Deep hashing has been widely used for large-scale approximate nearest neighbor search
due to its storage and search efficiency. However, existing deep hashing methods …