Beyond just vision: A review on self-supervised representation learning on multimodal and temporal data
Recently, Self-Supervised Representation Learning (SSRL) has attracted much attention in
the field of computer vision, speech, natural language processing (NLP), and recently, with …
Self-supervised video-based action recognition with disturbances
Self-supervised video-based action recognition is a challenging task, which needs to extract
the principal information characterizing the action from content-diversified videos over large …
Data-efficient masked video modeling for self-supervised action recognition
Recently, self-supervised video representation learning based on Masked Video Modeling
(MVM) has demonstrated promising results for action recognition. However, existing …
Pose-based contrastive learning for domain agnostic activity representations
While recognition accuracies of video classification models trained on conventional
benchmarks are gradually saturating, recent studies raise alarm about the learned …
Comparing learning methodologies for self-supervised audio-visual representation learning
In recent years, the machine learning community has devoted an increasing attention to self-
supervised learning. The performance gap between supervised and self-supervised has …
GOCA: Guided online cluster assignment for self-supervised video representation learning
Clustering is a ubiquitous tool in unsupervised learning. Most of the existing self-supervised
representation learning methods typically cluster samples based on visually dominant …
Similarity contrastive estimation for image and video soft contrastive self-supervised learning
Contrastive representation learning has proven to be an effective self-supervised learning
method for images and videos. Most successful approaches are based on Noise Contrastive …
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
L Vilaca, Y Yu, P Vinan - arXiv preprint arXiv:2412.00049, 2024 - arxiv.org
Audio-visual correlation learning aims to capture and understand natural phenomena
between audio and visual data. The rapid growth of Deep Learning propelled the …
TabFedSL: A Self-Supervised Approach to Labeling Tabular Data in Federated Learning Environments
R Wang, Y Hu, Z Chen, J Guo, G Liu - Mathematics, 2024 - mdpi.com
Currently, self-supervised learning has shown effectiveness in solving data labeling issues.
Its success mainly depends on having access to large, high-quality datasets with diverse …
About Time: Advances, Challenges, and Outlooks of Action Understanding
We have witnessed impressive advances in video action understanding. Increased dataset
sizes, variability, and computation availability have enabled leaps in performance and task …