Deep learning for object detection and scene perception in self-driving cars: Survey, challenges, and open issues

A Gupta, A Anpalagan, L Guan, AS Khwaja - Array, 2021 - Elsevier
This article presents a comprehensive survey of deep learning applications for object
detection and scene perception in autonomous vehicles. Unlike existing review papers, we …

3D Human Action Recognition: Through the eyes of researchers

A Sarkar, A Banerjee, PK Singh, R Sarkar - Expert Systems with …, 2022 - Elsevier
Human Action Recognition (HAR) has remained one of the most challenging tasks
in computer vision. With the surge in data-driven methodologies, the depth modality has …

A survey on deep multimodal learning for computer vision: advances, trends, applications, and datasets

K Bayoudh, R Knani, F Hamdaoui, A Mtibaa - The Visual Computer, 2022 - Springer
The research progress in multimodal learning has grown rapidly over the last decade in
several areas, especially in computer vision. The growing potential of multimodal data …

Object detection recognition and robot grasping based on machine learning: A survey

Q Bai, S Li, J Yang, Q Song, Z Li, X Zhang - IEEE Access, 2020 - ieeexplore.ieee.org
With the rapid development of machine learning, its powerful function in the machine vision
field is increasingly reflected. The combination of machine vision and robotics to achieve the …

Gesture recognition based on multi‐modal feature weight

H Duan, Y Sun, W Cheng, D Jiang… - Concurrency and …, 2021 - Wiley Online Library
With the continuous development of sensor technology, the acquisition cost of RGB‐D
images is getting lower and lower, and gesture recognition based on depth images and Red …

P4Contrast: Contrastive learning with pairs of point-pixel pairs for RGB-D scene understanding

Y Liu, L Yi, S Zhang, Q Fan, T Funkhouser… - ar…

… grasping of objects with uncertain information: A review

C Wang, X Zhang, X Zang, Y Liu, G Ding, W Yin, J Zhao - Sensors, 2020 - mdpi.com
As there come to be more applications of intelligent robots, their task object is becoming
more varied. However, it is still a challenge for a robot to handle unfamiliar objects. We …

CCAFFMNet: Dual-spectral semantic segmentation network with channel-coordinate attention feature fusion module

S Yi, J Li, X Liu, X Yuan - Neurocomputing, 2022 - Elsevier
Dual-spectral (RGB-thermal) semantic segmentation is a fundamental task for
visual perception of autonomous driving in harsh imaging environments (such as darkness …

RGB-D fusion models for construction and demolition waste detection

J Li, H Fang, L Fan, J Yang, T Ji, Q Chen - Waste Management, 2022 - Elsevier
The development of urbanization has brought a large amount of construction and demolition
waste (CDW), which occupy land and cause adverse ecological effects. To effectively solve …

Deep Multimodal Data Fusion

F Zhao, C Zhang, B Geng - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal Artificial Intelligence (Multimodal AI), in general, involves various types of data
(e.g., images, texts, or data collected from different sensors), feature engineering (e.g. …