A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

Deep learning-based perception systems for autonomous driving: A comprehensive survey

LH Wen, KH Jo - Neurocomputing, 2022 - Elsevier
With the rapid development of society and the economy, autonomous driving techniques are
widely applied in many areas, such as autonomous vehicles, autonomous drones, and …

OctFormer: Octree-based transformers for 3d point clouds

PS Wang - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
We propose octree-based transformers, named OctFormer, for 3D point cloud learning.
OctFormer can not only serve as a general and effective backbone for 3D point cloud …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

An end-to-end transformer model for 3d object detection

I Misra, R Girdhar, A Joulin - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
We propose 3DETR, an end-to-end Transformer based object detection model for 3D point
clouds. Compared to existing detection methods that employ a number of 3D-specific …

Surface representation for point clouds

H Ran, J Liu, C Wang - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Most prior work represents the shapes of point clouds by coordinates. However, it is
insufficient to describe the local geometry directly. In this paper, we present RepSurf …

CCTSDB 2021: a more comprehensive traffic sign detection benchmark

J Zhang, X Zou, LD Kuang, J Wang… - Human-centric …, 2022 - centaur.reading.ac.uk
Traffic signs are one of the most important information that guide cars to travel, and the
detection of traffic signs is an important component of autonomous driving and intelligent …

Multimodal token fusion for vision transformers

Y Wang, X Chen, L Cao, W Huang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Many adaptations of transformers have emerged to address the single-modal vision tasks,
where self-attention modules are stacked to handle input sources like images. Intuitively …

Group-free 3d object detection via transformers

Z Liu, Z Zhang, Y Cao, H Hu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recently, directly detecting 3D objects from 3D point clouds has received increasing
attention. To extract object representation from an irregular point cloud, existing methods …

Self-supervised pretraining of 3d features on any point-cloud

Z Zhang, R Girdhar, A Joulin… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Pretraining on large labeled datasets is a prerequisite to achieve good performance in many
computer vision tasks like image recognition, video understanding etc. However, pretraining …