A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
Contrastive representation learning: A framework and review
Contrastive Learning has recently received interest due to its success in self-supervised
representation learning in the computer vision domain. However, the origins of Contrastive …
Masked autoencoders are scalable vision learners
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners
for computer vision. Our MAE approach is simple: we mask random patches of the input …
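As a rough illustration of the random patch masking step mentioned in the snippet, here is a minimal NumPy sketch; the function name, the 75% mask ratio, and the patch sizes are illustrative assumptions rather than details taken from the snippet.

import numpy as np

def random_patch_mask(patches, mask_ratio=0.75, rng=None):
    # patches: (num_patches, patch_dim) array of flattened image patches
    rng = np.random.default_rng() if rng is None else rng
    num_patches = patches.shape[0]
    num_keep = int(num_patches * (1 - mask_ratio))
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])   # visible patches: the only input to the encoder
    mask_idx = np.sort(perm[num_keep:])   # masked patches: reconstructed from the latent and mask tokens
    return patches[keep_idx], keep_idx, mask_idx

# e.g. a 224x224 image split into 14x14 = 196 patches of 16x16x3 = 768 values each
patches = np.random.randn(196, 768)
visible, keep_idx, mask_idx = random_patch_mask(patches)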
VideoMAE: Masked autoencoders are data-efficient learners for self-supervised video pre-training
Pre-training video transformers on extra large-scale datasets is generally required to
achieve premier performance on relatively small datasets. In this paper, we show that video …
Masked autoencoders as spatiotemporal learners
This paper studies a conceptually simple extension of Masked Autoencoders (MAE) to
spatiotemporal representation learning from videos. We randomly mask out spacetime …
Momentum contrast for unsupervised visual representation learning
We present Momentum Contrast (MoCo) for unsupervised visual representation
learning. From a perspective on contrastive learning as dictionary look-up, we build a …
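The truncated snippet above frames contrastive learning as dictionary look-up; below is a hedged NumPy sketch of that idea (a FIFO queue of encoded keys plus a momentum-updated key encoder). The class and method names, queue size, momentum, and temperature values are illustrative assumptions, not the paper's exact interface.

import collections
import numpy as np

class MoCoDictionary:
    # Sketch of the dictionary-as-queue view: encoded keys from past batches are kept
    # in a FIFO queue and reused as negatives for new queries.
    def __init__(self, dim=128, queue_size=4096, momentum=0.999):
        self.queue = collections.deque(maxlen=queue_size)
        self.momentum = momentum
        self.dim = dim

    def momentum_update(self, query_params, key_params):
        # The key encoder trails the query encoder as an exponential moving average.
        return [self.momentum * k + (1.0 - self.momentum) * q
                for q, k in zip(query_params, key_params)]

    def enqueue(self, keys):
        # keys: (batch, dim) L2-normalized encodings; oldest entries fall out automatically.
        for k in keys:
            self.queue.append(k)

    def logits(self, query, positive_key, temperature=0.07):
        # One positive similarity followed by similarities to all queued negatives.
        negatives = np.stack(self.queue) if self.queue else np.zeros((0, self.dim))
        l_pos = float(query @ positive_key)
        l_neg = negatives @ query
        return np.concatenate(([l_pos], l_neg)) / temperature

In use, the query encoder would be trained with a cross-entropy loss over these logits (positive at index 0), while the key encoder is updated only through momentum_update.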
Unsupervised learning of visual features by contrasting cluster assignments
Unsupervised image representations have significantly reduced the gap with supervised
pretraining, notably with the recent achievements of contrastive learning methods. These …
ELECTRA: Pre-training text encoders as discriminators rather than generators
K. Clark, arXiv preprint arXiv:2003.10555, 2020
Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by
replacing some tokens with [MASK] and then train a model to reconstruct the original tokens …
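Since the snippet describes the MLM corruption that ELECTRA contrasts itself with, a minimal sketch of that corruption step may help; the 15% masking probability is BERT's commonly cited default, and the function name and token ids are illustrative assumptions.

import random

def corrupt_for_mlm(token_ids, mask_token_id, mask_prob=0.15, seed=None):
    # Replace a random subset of tokens with [MASK]; the model is trained to
    # reconstruct the original ids at exactly those positions.
    rng = random.Random(seed)
    corrupted = list(token_ids)
    targets = [None] * len(token_ids)     # None = position not masked, no loss computed
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            targets[i] = tok              # reconstruction target is the original token
            corrupted[i] = mask_token_id  # the model only sees [MASK] here
    return corrupted, targets

# usage with made-up token ids; 103 stands in for the [MASK] id purely for illustration
corrupted, targets = corrupt_for_mlm([101, 7592, 2088, 2003, 2307, 102],
                                     mask_token_id=103, seed=0)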
Representation learning with contrastive predictive coding
While supervised learning has enabled great progress in many applications, unsupervised
learning has not seen such widespread adoption, and remains an important and …
Contrastive multiview coding
Humans view the world through many sensory channels, e.g., the long-wavelength light
channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right …