A comprehensive survey of vision-based human action recognition methods

HB Zhang, YX Zhang, B Zhong, Q Lei, L Yang, JX Du… - Sensors, 2019 - mdpi.com
Although widely used in many applications, accurate and efficient human action recognition
remains a challenging area of research in the field of computer vision. Most recent surveys …

A comprehensive survey of scene graphs: Generation and application

X Chang, P Ren, P Xu, Z Li, X Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
A scene graph is a structured representation of a scene that can clearly express the objects,
attributes, and relationships between objects in the scene. As computer vision technology …

CLIP-Event: Connecting text and images with event structures

M Li, R Xu, S Wang, L Zhou, X Lin… - Proceedings of the …, 2022 - openaccess.thecvf.com
Vision-language (V+L) pretraining models have achieved great success in
supporting multimedia applications by understanding the alignments between images and …

Reconstructing hands in 3D with transformers

G Pavlakos, D Shan, I Radosavovic… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present an approach that can reconstruct hands in 3D from monocular input. Our
approach for Hand Mesh Recovery, HaMeR, follows a fully transformer-based architecture …

Learning human-object interactions by graph parsing neural networks

S Qi, W Wang, B Jia, J Shen… - Proceedings of the …, 2018 - openaccess.thecvf.com
This paper addresses the task of detecting and recognizing human-object interactions (HOI)
in images and videos. We introduce the Graph Parsing Neural Network (GPNN), a …

DRG: Dual relation graph for human-object interaction detection

C Gao, J Xu, Y Zou, JB Huang - … Conference, Glasgow, UK, August 23–28 …, 2020 - Springer
We tackle the challenging problem of human-object interaction (HOI) detection. Existing
methods either recognize the interaction of each human-object pair in isolation or perform …

Neural motifs: Scene graph parsing with global context

R Zellers, M Yatskar, S Thomson… - Proceedings of the …, 2018 - openaccess.thecvf.com
We investigate the problem of producing structured graph representations of visual scenes.
Our work analyzes the role of motifs: regularly appearing substructures in scene graphs. We …

Understanding human hands in contact at internet scale

D Shan, J Geng, M Shu… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
Hands are the central means by which humans manipulate their world and being able to
reliably extract hand state information from Internet videos of humans engaged in their …

AVA: A video dataset of spatio-temporally localized atomic visual actions

C Gu, C Sun, DA Ross, C Vondrick… - Proceedings of the …, 2018 - openaccess.thecvf.com
This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions
(AVA). The AVA dataset densely annotates 80 atomic visual actions in 437 15-minute video …

Scene graph generation by iterative message passing

D Xu, Y Zhu, CB Choy, L Fei-Fei - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
Understanding a visual scene goes beyond recognizing individual objects in isolation.
Relationships between objects also constitute rich semantic information about the scene. In …