A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (eg, social network …

Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Videocomposer: Compositional video synthesis with motion controllability

X Wang, H Yuan, S Zhang, D Chen… - Advances in …, 2023 - proceedings.neurips.cc
The pursuit of controllability as a higher standard of visual content creation has yielded
remarkable progress in customizable image synthesis. However, achieving controllable …

Tridet: Temporal action detection with relative boundary modeling

D Shi, Y Zhong, Q Cao, L Ma, J Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we present a one-stage framework TriDet for temporal action detection.
Existing methods often suffer from imprecise boundary predictions due to the ambiguous …

Actionformer: Localizing moments of actions with transformers

CL Zhang, J Wu, Y Li - European Conference on Computer Vision, 2022 - Springer
Self-attention based Transformer models have demonstrated impressive results for image
classification and object detection, and more recently for video understanding. Inspired by …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

Msr-gcn: Multi-scale residual graph convolution networks for human motion prediction

L Dang, Y Nie, C Long, Q Zhang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Human motion prediction is a challenging task due to the stochasticity and aperiodicity of
future poses. Recently, graph convolutional network has been proven to be very effective to …

Learning salient boundary feature for anchor-free temporal action localization

C Lin, C Xu, D Luo, Y Wang, Y Tai… - Proceedings of the …, 2021 - openaccess.thecvf.com
Temporal action localization is an important yet challenging task in video understanding.
Typically, such a task aims at inferring both the action category and localization of the start …

End-to-end temporal action detection with transformer

X Liu, Q Wang, Y Hu, X Tang, S Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Temporal action detection (TAD) aims to determine the semantic label and the temporal
interval of every action instance in an untrimmed video. It is a fundamental and challenging …

G-tad: Sub-graph localization for temporal action detection

M Xu, C Zhao, DS Rojas, A Thabet… - Proceedings of the …, 2020 - openaccess.thecvf.com
Temporal action detection is a fundamental yet challenging task in video understanding.
Video context is a critical cue to effectively detect actions, but current works mainly focus on …