RGB-T image analysis technology and application: A survey

K Song, Y Zhao, L Huang, Y Yan, Q Meng - Engineering Applications of …, 2023 - Elsevier
RGB-Thermal infrared (RGB-T) image analysis has been actively studied in recent
years. In the past decade, it has received wide attention and made a lot of important …

RGB-D and thermal sensor fusion: A systematic literature review

M Brenner, NH Reyes, T Susnjak, ALC Barczak - IEEE Access, 2023 - ieeexplore.ieee.org
In the last decade, the computer vision field has seen significant progress in multimodal data
fusion and learning, where multiple sensors, including depth, infrared, and visual, are used …

Attention-based generative adversarial network with internal damage segmentation using thermography

R Ali, YJ Cha - Automation in Construction, 2022 - Elsevier
This paper describes a real-time, high-performance deep-learning network to segment
internal damages of concrete members at the pixel level using active thermography. Unlike …

Attention mechanism exploits temporal contexts: Real-time 3D human pose reconstruction

R Liu, J Shen, H Wang, C Chen… - Proceedings of the …, 2020 - openaccess.thecvf.com
We propose a novel attention-based framework for 3D human pose estimation from a
monocular video. Despite the general success of end-to-end deep learning paradigms, our …

A novel visible-depth-thermal image dataset of salient object detection for robotic visual perception

K Song, J Wang, Y Bao, L Huang… - IEEE/ASME Transactions …, 2022 - ieeexplore.ieee.org
Visual perception plays an important role in industrial information field, especially in robotic
grasping application. In order to detect the object to be grasped quickly and accurately …

Synthetic data generation for end-to-end thermal infrared tracking

L Zhang, A Gonzalez-Garcia… - … on Image Processing, 2018 - ieeexplore.ieee.org
The usage of both off-the-shelf and end-to-end trained deep networks have significantly
improved the performance of visual tracking on RGB videos. However, the lack of large …

Crack detection of masonry structure based on thermal and visible image fusion and semantic segmentation

H Huang, Y Cai, C Zhang, Y Lu, A Hammad… - Automation in …, 2024 - Elsevier
The integration of visible and thermal images has demonstrated the potential ability to
enhance crack segmentation accuracy. However, due to the intricate texture of masonry …

Multi-modal recurrent attention networks for facial expression recognition

J Lee, S Kim, S Kim, K Sohn - IEEE Transactions on Image …, 2020 - ieeexplore.ieee.org
Recent deep neural networks based methods have achieved state-of-the-art performance
on various facial expression recognition tasks. Despite such progress, previous researches …

UniMod1K: towards a more universal large-scale dataset and benchmark for multi-modal learning

XF Zhu, T Xu, Z Liu, Z Tang, XJ Wu, J Kittler - International Journal of …, 2024 - Springer
The emergence of large-scale high-quality datasets has stimulated the rapid development of
deep learning in recent years. However, most computer vision tasks focus on the visual …

Cross-modality person re-identification via multi-task learning

N Huang, K Liu, Y Liu, Q Zhang, J Han - Pattern Recognition, 2022 - Elsevier
Despite its promising preliminary results, existing cross-modality Visible-Infrared Person
Re-IDentification (VI-PReID) models incorporating semantic (person) masks simply use these …