Human action recognition from various data modalities: A review

Z Sun, Q Ke, H Rahmani, M Bennamoun… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Human Action Recognition (HAR) aims to understand human behavior and assign a label to
each action. It has a wide range of applications, and therefore has been attracting increasing …

Temporal action segmentation: An analysis of modern techniques

G Ding, F Sener, A Yao - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …

BEDLAM: A synthetic dataset of bodies exhibiting detailed lifelike animated motion

MJ Black, P Patel, J Tesch… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We show, for the first time, that neural networks trained only on synthetic data achieve state-
of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …

Google Scanned Objects: A high-quality dataset of 3D scanned household items

L Downs, A Francis, N Koenig, B Kinman… - … on Robotics and …, 2022 - ieeexplore.ieee.org
Interactive 3D simulations have enabled breakthroughs in robotics and computer vision, but
simulating the broad diversity of environments needed for deep learning requires large …

Assembly101: A large-scale multi-view video dataset for understanding procedural activities

F Sener, D Chatterjee, D Shelepov… - Proceedings of the …, 2022 - openaccess.thecvf.com
Assembly101 is a new procedural activity dataset featuring 4321 videos of people
assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed …

EgoExoLearn: A dataset for bridging asynchronous ego- and exo-centric view of procedural activities in real world

Y Huang, G Chen, J Xu, M Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Being able to map the activities of others into one's own point of view is one fundamental
human skill even from a very early age. Taking a step toward understanding this human …

Error detection in egocentric procedural task videos

SP Lee, Z Lu, Z Zhang, M Hoai… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present a new egocentric procedural error dataset containing videos with various types
of errors as well as normal videos and propose a new framework for procedural error …

Learning fine-grained view-invariant representations from unpaired ego-exo videos via temporal alignment

ZS Xue, K Grauman - Advances in Neural Information …, 2023 - proceedings.neurips.cc
The egocentric and exocentric viewpoints of a human activity look dramatically different, yet
invariant representations to link them are essential for many potential applications in …

Progressive instance-aware feature learning for compositional action recognition

R Yan, L Xie, X Shu, L Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In order to enable the model to generalize to unseen “action-objects” (compositional action),
previous methods encode multiple pieces of information (i.e., the appearance, position, and …

Learning by aligning videos in time

S Haresh, S Kumar, H Coskun… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a self-supervised approach for learning video representations using temporal
video alignment as a pretext task, while exploiting both frame-level and video-level …