ChatGPT and Open-AI models: A preliminary review
KI Roumeliotis, ND Tselikas - Future Internet, 2023 - mdpi.com
According to numerous reports, ChatGPT represents a significant breakthrough in the field of
artificial intelligence. ChatGPT is a pre-trained AI model designed to engage in natural …
Domain generalization: A survey
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet
challenging for machines to reproduce. This is because most learning algorithms strongly …
DINOv2: Learning robust visual features without supervision
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …
LAION-5B: An open large-scale dataset for training next generation image-text models
C Schuhmann, R Beaumont, R Vencu… - Advances in …, 2022 - proceedings.neurips.cc
Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of
training on large amounts of noisy image-text data, without relying on expensive accurate …
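CLIP-style pretraining of this kind couples an image encoder and a text encoder and trains them with a symmetric contrastive objective over noisy image-text pairs. The snippet below is a minimal PyTorch sketch of that objective only; the embedding shapes and temperature value are illustrative assumptions, not the configuration used for LAION-5B or the original CLIP.

```python
# Minimal sketch of the symmetric image-text contrastive loss popularized by CLIP.
# The temperature and batch/embedding sizes are illustrative placeholders, not the
# settings used in LAION-5B training.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (batch, dim) embeddings from paired images and captions."""
    # L2-normalize so dot products are cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix; matching image-caption pairs lie on the diagonal.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy: image-to-text and text-to-image directions.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```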
A ConvNet for the 2020s
The" Roaring 20s" of visual recognition began with the introduction of Vision Transformers
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …
Scaling vision transformers to 22 billion parameters
The scaling of Transformers has driven breakthrough capabilities for language models. At
present, the largest large language models (LLMs) contain upwards of 100B parameters …
Masked autoencoders are scalable vision learners
This paper shows that masked autoencoders (MAE) are scalable self-supervised learners
for computer vision. Our MAE approach is simple: we mask random patches of the input …
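The MAE recipe masks a large random fraction of image patches, encodes only the visible patches, and reconstructs the pixels of the masked ones. The following is a simplified sketch of that masking and reconstruction bookkeeping; the tiny MLP encoder/decoder, patch size, and 75% mask ratio are stand-in assumptions for the paper's ViT encoder, transformer decoder, and positional embeddings.

```python
# Simplified sketch of the MAE idea: mask most patches, encode only visible ones,
# reconstruct pixels of the masked ones. Real MAE uses a ViT encoder, a transformer
# decoder, and positional embeddings; this only illustrates the core bookkeeping.
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    def __init__(self, patch_dim=16 * 16 * 3, embed_dim=128, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.encoder = nn.Sequential(nn.Linear(patch_dim, embed_dim), nn.GELU(),
                                     nn.Linear(embed_dim, embed_dim))
        self.decoder = nn.Linear(embed_dim, patch_dim)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, embed_dim))

    def forward(self, patches):                      # patches: (B, N, patch_dim)
        B, N, D = patches.shape
        n_keep = int(N * (1 - self.mask_ratio))
        # Random per-sample permutation; the first n_keep patches stay visible.
        ids = torch.rand(B, N, device=patches.device).argsort(dim=1)
        keep, drop = ids[:, :n_keep], ids[:, n_keep:]
        visible = torch.gather(patches, 1, keep.unsqueeze(-1).expand(-1, -1, D))

        # Encode visible patches only, then append mask tokens for dropped positions.
        enc = self.encoder(visible)
        dec_in = torch.cat([enc, self.mask_token.expand(B, N - n_keep, -1)], dim=1)
        recon = self.decoder(dec_in)                 # (B, N, patch_dim), permuted order

        # Reconstruction loss is computed on the masked positions only.
        order = torch.cat([keep, drop], dim=1)
        target = torch.gather(patches, 1, order.unsqueeze(-1).expand(-1, -1, D))
        return ((recon[:, n_keep:] - target[:, n_keep:]) ** 2).mean()
```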
CoCa: Contrastive captioners are image-text foundation models
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …
Conditional prompt learning for vision-language models
With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential
to investigate ways to adapt these models to downstream datasets. A recently proposed …
EVA: Exploring the limits of masked visual representation learning at scale
We launch EVA, a vision-centric foundation model to explore the limits of visual
representation at scale using only publicly accessible data. EVA is a vanilla ViT pre-trained …