A review of multimodal human activity recognition with special emphasis on classification, applications, challenges and future directions

SK Yadav, K Tiwari, HM Pandey, SA Akbar - Knowledge-Based Systems, 2021 - Elsevier
Human activity recognition (HAR) is one of the most important and challenging problems in
computer vision. It has critical applications in a wide variety of tasks including gaming …

Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges

HF Nweke, YW Teh, MA Al-Garadi, UR Alo - Expert Systems with …, 2018 - Elsevier
Human activity recognition systems are developed as part of a framework to enable
continuous monitoring of human behaviours in the area of ambient assisted living, sports …

MS-TCN: Multi-stage temporal convolutional network for action segmentation

YA Farha, J Gall - Proceedings of the IEEE/CVF conference …, 2019 - openaccess.thecvf.com
Temporally locating and classifying action segments in long untrimmed videos is of
particular interest to many applications like surveillance and robotics. While traditional …

EPIC-Fusion: Audio-visual temporal binding for egocentric action recognition

E Kazakos, A Nagrani, A Zisserman… - Proceedings of the …, 2019 - openaccess.thecvf.com
We focus on multi-modal fusion for egocentric action recognition, and propose a novel
architecture for multi-modal temporal-binding, i.e. the combination of modalities within a …

Temporal convolutional networks for action segmentation and detection

C Lea, MD Flynn, R Vidal, A Reiter… - Proceedings of the …, 2017 - openaccess.thecvf.com
The ability to identify and temporally segment fine-grained human actions throughout a
video is crucial for robotics, surveillance, education, and beyond. Typical approaches …

MS-TCN++: Multi-stage temporal convolutional network for action segmentation

S Li, YA Farha, Y Liu, MM Cheng… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
With the success of deep learning in classifying short trimmed videos, more attention has
been focused on temporally segmenting and classifying activities in long untrimmed videos …

Temporal convolutional networks: A unified approach to action segmentation

C Lea, R Vidal, A Reiter, GD Hager - … , The Netherlands, October 8-10 and …, 2016 - Springer
The dominant paradigm for video-based action segmentation is composed of two steps: first,
compute low-level features for each frame using Dense Trajectories or a Convolutional …

First-person hand action benchmark with RGB-D videos and 3D hand pose annotations

G Garcia-Hernando, S Yuan… - Proceedings of the …, 2018 - openaccess.thecvf.com
In this work we study the use of 3D hand poses to recognize first-person dynamic hand
actions interacting with 3D objects. Towards this goal, we collected RGB-D video sequences …

In the eye of beholder: Joint learning of gaze and actions in first person video

Y Li, M Liu, JM Rehg - Proceedings of the European …, 2018 - openaccess.thecvf.com
We address the task of jointly determining what a person is doing and where they are
looking based on the analysis of video captured by a headworn camera. We propose a …

H2O: Two hands manipulating objects for first person interaction recognition

T Kwon, B Tekin, J Stühmer, F Bogo… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a comprehensive framework for egocentric interaction recognition using
markerless 3D annotations of two hands manipulating objects. To this end, we propose a …