Beyond just vision: A review on self-supervised representation learning on multimodal and temporal data

S Deldari, H Xue, A Saeed, J He, DV Smith… - arXiv preprint arXiv …, 2022 - arxiv.org
Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in
the field of computer vision, speech, natural language processing (NLP), and recently, with …

Self-supervised video-based action recognition with disturbances

W Lin, X Ding, Y Huang, H Zeng - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org
Self-supervised video-based action recognition is a challenging task, which needs to extract
the principal information characterizing the action from content-diversified videos over large …

Data-efficient masked video modeling for self-supervised action recognition

Q Li, X Huang, Z Wan, L Hu, S Wu, J Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org
Recently, self-supervised video representation learning based on Masked Video Modeling
(MVM) has demonstrated promising results for action recognition. However, existing …

Pose-based contrastive learning for domain agnostic activity representations

D Schneider, S Sarfraz, A Roitberg… - Proceedings of the …, 2022 - openaccess.thecvf.com
While recognition accuracies of video classification models trained on conventional
benchmarks are gradually saturating, recent studies raise alarm about the learned …

Comparing learning methodologies for self-supervised audio-visual representation learning

H Terbouche, L Schoneveld, O Benson… - IEEE Access, 2022 - ieeexplore.ieee.org
In recent years, the machine learning community has devoted increasing attention to
self-supervised learning. The performance gap between supervised and self-supervised has …

GOCA: Guided online cluster assignment for self-supervised video representation learning

H Coskun, A Zareian, JL Moore, F Tombari… - European Conference on …, 2022 - Springer
Clustering is a ubiquitous tool in unsupervised learning. Most of the existing self-supervised
representation learning methods typically cluster samples based on visually dominant …

Similarity contrastive estimation for image and video soft contrastive self-supervised learning

J Denize, J Rabarisoa, A Orcesi, R Hérault - Machine Vision and …, 2023 - Springer
Contrastive representation learning has proven to be an effective self-supervised learning
method for images and videos. Most successful approaches are based on Noise Contrastive …

A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

L Vilaça, Y Yu, P Viana - arXiv preprint arXiv:2412.00049, 2024 - arxiv.org
Audio-visual correlation learning aims to capture and understand natural phenomena
between audio and visual data. The rapid growth of Deep Learning propelled the …

TabFedSL: A Self-Supervised Approach to Labeling Tabular Data in Federated Learning Environments

R Wang, Y Hu, Z Chen, J Guo, G Liu - Mathematics, 2024 - mdpi.com
Currently, self-supervised learning has shown effectiveness in solving data labeling issues.
Its success mainly depends on having access to large, high-quality datasets with diverse …

About Time: Advances, Challenges, and Outlooks of Action Understanding

A Stergiou, R Poppe - arXiv preprint arXiv:2411.15106, 2024 - arxiv.org
We have witnessed impressive advances in video action understanding. Increased dataset
sizes, variability, and computation availability have enabled leaps in performance and task …