Anticipative video transformer
Abstract We propose Anticipative Video Transformer (AVT), an end-to-end attention-based
video modeling architecture that attends to the previously observed video in order to …
Self-supervised visual feature learning with deep neural networks: A survey
Large-scale labeled data are generally required to train deep neural networks in order to
obtain better performance in visual feature learning from images or videos for computer …
Self-supervised learning by cross-modal audio-video clustering
Visual and audio modalities are highly correlated, yet they contain different information.
Their strong correlation makes it possible to predict the semantics of one from the other with …
Self-supervised learning for medical image analysis using image context restoration
Abstract Machine learning, particularly deep learning has boosted medical image analysis
over the past years. Training a good model based on deep learning requires large amount …
Memory-augmented dense predictive coding for video representation learning
The objective of this paper is self-supervised learning from video, in particular for
representations for action recognition. We make the following contributions: (i) We propose a …
Video representation learning by dense predictive coding
The objective of this paper is self-supervised learning of spatio-temporal embeddings from
video, suitable for human action recognition. We make three contributions: First, we …
Slow down to go better: A survey on slow feature analysis
Temporal data contain a wealth of valuable information, playing an essential role in various
machine-learning tasks. Slow feature analysis (SFA), one of the most classic temporal …
A review of predictive and contrastive self-supervised learning for medical images
WC Wang, E Ahn, D Feng, J Kim - Machine Intelligence Research, 2023 - Springer
Over the last decade, supervised deep learning on manually annotated big data has been
progressing significantly on computer vision tasks. But, the application of deep learning in …
Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection
Video anomaly detection under weak labels is formulated as a typical multiple-instance
learning problem in previous works. In this paper, we provide a new perspective, i.e., a …
Segmenting objects from relational visual data
In this article, we model a set of pixelwise object segmentation tasks—automatic video
segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation …