Transformer for skeleton-based action recognition: A review of recent advances
Skeleton-based action recognition has rapidly become one of the most popular and
essential research topics in computer vision. The task is to analyze the characteristics of …
Graph convolutional neural network for human action recognition: A comprehensive survey
Video-based human action recognition is one of the most important and challenging areas
of research in the field of computer vision. Human action recognition has found many …
FlowFormer: A transformer architecture for optical flow
We introduce optical Flow transFormer, dubbed as FlowFormer, a transformer-based neural
network architecture for learning optical flow. FlowFormer tokenizes the 4D cost volume built …
FlowFormer++: Masked cost volume autoencoding for pretraining optical flow estimation
FlowFormer introduces a transformer architecture into optical flow estimation and achieves
state-of-the-art performance. The core component of FlowFormer is the transformer-based …
X3D: Expanding architectures for efficient video recognition
C Feichtenhofer - Proceedings of the IEEE/CVF conference …, 2020 - openaccess.thecvf.com
This paper presents X3D, a family of efficient video networks that progressively expand a
tiny 2D image classification architecture along multiple network axes, in space, time, width …
Vision-based human activity recognition: a survey
Human activity recognition (HAR) systems attempt to automatically identify and analyze
human activities using acquired information from various types of sensors. Although several …
Real-time intermediate flow estimation for video frame interpolation
Real-time video frame interpolation (VFI) is very useful in video processing, media players,
and display devices. We propose RIFE, a Real-time Intermediate Flow Estimation algorithm …
CRAFT: Cross-attentional flow transformer for robust optical flow
Optical flow estimation aims to find the 2D motion field by identifying corresponding pixels
between two images. Despite the tremendous progress of deep learning-based optical flow …
STM: Spatiotemporal and motion encoding for action recognition
Spatiotemporal and motion features are two complementary and crucial information for
video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn …
A comprehensive study of deep video action recognition
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …