Multi-view action recognition using contrastive learning

K Shah, A Shah, CP Lau… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we present a method for RGB-based action recognition using multi-view videos.
We present a supervised contrastive learning framework to learn a feature embedding …

Mitigating and evaluating static bias of action representations in the background and the foreground

H Li, Y Liu, H Zhang, B Li - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In video action recognition, shortcut static features can interfere with the learning of motion
features, resulting in poor out-of-distribution (OOD) generalization. The video background is …

Uncovering the hidden dynamics of video self-supervised learning under distribution shifts

P Sarkar, A Beirami, A Etemad - Advances in Neural …, 2023 - proceedings.neurips.cc
Video self-supervised learning (VSSL) has made significant progress in recent years.
However, the exact behavior and dynamics of these models under different forms of …

Enabling detailed action recognition evaluation through video dataset augmentation

J Chung, Y Wu, O Russakovsky - Advances in Neural …, 2022 - proceedings.neurips.cc
It is well-known in the video understanding community that human action recognition models
suffer from background bias, i.e., over-relying on scene cues in making their predictions …

Enhancing motion visual cues for self-supervised video representation learning

M Nie, Z Quan, W Ding, W Yang - Engineering Applications of Artificial …, 2023 - Elsevier
Building the general feature from unlabeled videos is the core of self-supervised video
representation learning. However, recent research on video representation focuses on static …

Attentive spatial-temporal contrastive learning for self-supervised video representation

X Yang, S Xiong, K Wu, D Shan, Z Xie - Image and Vision Computing, 2023 - Elsevier
Most existing self-supervised works learn video representation by using a single pretext
task. A single pretext task, providing single supervision from unlabeled data, may neglect to …

Frequency selective augmentation for video representation learning

J Kim, T Kim, M Shim, D Han, D Wee… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Recent self-supervised video representation learning methods focus on maximizing the
similarity between multiple augmented views from the same video and largely rely on the …

Unifying Video Self-Supervised Learning across Families of Tasks: A Survey

I Dave, M Gunawardhana, L Sadith, H Zhou, L David… - Preprints, 2024 - preprints.org
Video self-supervised learning (VideoSSL) offers significant potential for reducing
annotation costs and enhancing a wide range of downstream tasks in video understanding …