Human action recognition from various data modalities: A review

Z Sun, Q Ke, H Rahmani, M Bennamoun… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Human Action Recognition (HAR) aims to understand human behavior and assign a label to
each action. It has a wide range of applications, and therefore has been attracting increasing …

A survey on deep learning for human activity recognition

F Gu, MH Chung, M Chignell, S Valaee… - ACM Computing …, 2021 - dl.acm.org
Human activity recognition is a key to a lot of applications such as healthcare and smart
home. In this study, we provide a comprehensive survey on recent advances and challenges …

Mcvd-masked conditional video diffusion for prediction, generation, and interpolation

V Voleti, A Jolicoeur-Martineau… - Advances in neural …, 2022 - proceedings.neurips.cc
Video prediction is a challenging task. The quality of video frames from current state-of-the-
art (SOTA) generative models tends to be poor and generalization beyond the training data …

Simvp: Simpler yet better video prediction

Z Gao, C Tan, L Wu, SZ Li - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Abstract From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …

Diffusion probabilistic modeling for video generation

R Yang, P Srivastava, S Mandt - Entropy, 2023 - mdpi.com
Denoising diffusion probabilistic models are a promising new class of generative models
that mark a milestone in high-quality image generation. This paper showcases their ability to …

Predrnn: A recurrent neural network for spatiotemporal predictive learning

Y Wang, H Wu, J Zhang, Z Gao, J Wang… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
The predictive learning of spatiotemporal sequences aims to generate future images by
learning from the historical context, where the visual dynamics are believed to have modular …

Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit

T Zhou, X Lin, J Wu, Y Chen, H **e, Y Li, J Fan, H Wu… - Nature …, 2021 - nature.com
There is an ever-growing demand for artificial intelligence. Optical processors, which
compute with photons instead of electrons, can fundamentally accelerate the development …

Vision-based human activity recognition: a survey

DR Beddiar, B Nini, M Sabokrou, A Hadid - Multimedia Tools and …, 2020 - Springer
Human activity recognition (HAR) systems attempt to automatically identify and analyze
human activities using acquired information from various types of sensors. Although several …

Action transformer: A self-attention model for short-time pose-based human action recognition

V Mazzia, S Angarano, F Salvetti, F Angelini… - Pattern Recognition, 2022 - Elsevier
Deep neural networks based purely on attention have been successful across several
domains, relying on minimal architectural priors from the designer. In Human Action …

Languagebind: Extending video-language pretraining to n-modality by language-based semantic alignment

B Zhu, B Lin, M Ning, Y Yan, J Cui, HF Wang… - arxiv preprint arxiv …, 2023 - arxiv.org
The video-language (VL) pretraining has achieved remarkable improvement in multiple
downstream tasks. However, the current VL pretraining framework is hard to extend to …