A review of multimodal human activity recognition with special emphasis on classification, applications, challenges and future directions
Human activity recognition (HAR) is one of the most important and challenging problems in
computer vision. It has critical applications in a wide variety of tasks including gaming …
Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges
Human activity recognition systems are developed as part of a framework to enable
continuous monitoring of human behaviours in the area of ambient assisted living, sports …
MS-TCN: Multi-stage temporal convolutional network for action segmentation
Temporally locating and classifying action segments in long untrimmed videos is of
particular interest to many applications like surveillance and robotics. While traditional …
EPIC-Fusion: Audio-visual temporal binding for egocentric action recognition
We focus on multi-modal fusion for egocentric action recognition, and propose a novel
architecture for multi-modal temporal-binding, i.e., the combination of modalities within a …
Temporal convolutional networks for action segmentation and detection
The ability to identify and temporally segment fine-grained human actions throughout a
video is crucial for robotics, surveillance, education, and beyond. Typical approaches …
MS-TCN++: Multi-stage temporal convolutional network for action segmentation
With the success of deep learning in classifying short trimmed videos, more attention has
been focused on temporally segmenting and classifying activities in long untrimmed videos …
Temporal convolutional networks: A unified approach to action segmentation
The dominant paradigm for video-based action segmentation is composed of two steps: first,
compute low-level features for each frame using Dense Trajectories or a Convolutional …
First-person hand action benchmark with RGB-D videos and 3D hand pose annotations
In this work we study the use of 3D hand poses to recognize first-person dynamic hand
actions interacting with 3D objects. Towards this goal, we collected RGB-D video sequences …
In the eye of beholder: Joint learning of gaze and actions in first person video
We address the task of jointly determining what a person is doing and where they are
looking based on the analysis of video captured by a headworn camera. We propose a …
H2O: Two hands manipulating objects for first person interaction recognition
We present a comprehensive framework for egocentric interaction recognition using
markerless 3D annotations of two hands manipulating objects. To this end, we propose a …