Human action recognition from various data modalities: A review
Human Action Recognition (HAR) aims to understand human behavior and assign a label to
each action. It has a wide range of applications, and therefore has been attracting increasing …
A survey on deep learning for human activity recognition
Human activity recognition is a key to a lot of applications such as healthcare and smart
home. In this study, we provide a comprehensive survey on recent advances and challenges …
MCVD: Masked conditional video diffusion for prediction, generation, and interpolation
Video prediction is a challenging task. The quality of video frames from current state-of-the-
art (SOTA) generative models tends to be poor and generalization beyond the training data …
SimVP: Simpler yet better video prediction
From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
Diffusion probabilistic modeling for video generation
Denoising diffusion probabilistic models are a promising new class of generative models
that mark a milestone in high-quality image generation. This paper showcases their ability to …
PredRNN: A recurrent neural network for spatiotemporal predictive learning
The predictive learning of spatiotemporal sequences aims to generate future images by
learning from the historical context, where the visual dynamics are believed to have modular …
Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit
There is an ever-growing demand for artificial intelligence. Optical processors, which
compute with photons instead of electrons, can fundamentally accelerate the development …
Vision-based human activity recognition: a survey
Human activity recognition (HAR) systems attempt to automatically identify and analyze
human activities using acquired information from various types of sensors. Although several …
Action transformer: A self-attention model for short-time pose-based human action recognition
Deep neural networks based purely on attention have been successful across several
domains, relying on minimal architectural priors from the designer. In Human Action …
LanguageBind: Extending video-language pretraining to N-modality by language-based semantic alignment
The video-language (VL) pretraining has achieved remarkable improvement in multiple
downstream tasks. However, the current VL pretraining framework is hard to extend to …