A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

A comprehensive survey on test-time adaptation under distribution shifts

J Liang, R He, T Tan - International Journal of Computer Vision, 2025 - Springer
Abstract Machine learning methods strive to acquire a robust model during the training
process that can effectively generalize to test samples, even in the presence of distribution …

DINOv2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners

R Zhang, X Hu, B Li, S Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Visual recognition in low-data regimes requires deep neural networks to learn generalized
representations from limited training samples. Recently, CLIP-based methods have shown …

Diffusion art or digital forgery? Investigating data replication in diffusion models

G Somepalli, V Singla, M Goldblum… - Proceedings of the …, 2023 - openaccess.thecvf.com
Cutting-edge diffusion models produce images with high quality and customizability,
enabling them to be used for commercial art and graphic design purposes. But do diffusion …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This paper presents a comprehensive survey of the taxonomy and evolution of multimodal
foundation models that demonstrate vision and vision-language capabilities, focusing on the …

Advances, challenges and opportunities in creating data for trustworthy AI

W Liang, GA Tadesse, D Ho, L Fei-Fei… - Nature Machine …, 2022 - nature.com
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …

Rethinking semantic segmentation: A prototype view

T Zhou, W Wang, E Konukoglu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Prevalent semantic segmentation solutions, despite their different network designs (FCN
based or attention based) and mask decoding strategies (parametric softmax based or pixel …

Masked feature prediction for self-supervised visual pre-training

C Wei, H Fan, S Xie, CY Wu, A Yuille… - Proceedings of the …, 2022 - openaccess.thecvf.com
Abstract We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training
of video models. Our approach first randomly masks out a portion of the input sequence and …

SimMIM: A simple framework for masked image modeling

Z Xie, Z Zhang, Y Cao, Y Lin, J Bao… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper presents SimMIM, a simple framework for masked image modeling. We have
simplified recently proposed relevant approaches, without the need for special designs …