A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This monograph presents a comprehensive survey of the taxonomy and evolution of
multimodal foundation models that demonstrate vision and vision-language capabilities …

Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

Transformer-based unsupervised contrastive learning for histopathological image classification

X Wang, S Yang, J Zhang, M Wang, J Zhang… - Medical image …, 2022 - Elsevier
A large-scale and well-annotated dataset is a key factor for the success of deep learning in
medical image analysis. However, assembling such large annotations is very challenging …

Point-BERT: Pre-training 3D point cloud transformers with masked point modeling

X Yu, L Tang, Y Rao, T Huang… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present Point-BERT, a novel paradigm for learning Transformers to generalize the
concept of BERT onto 3D point cloud. Following BERT, we devise a Masked Point Modeling …

Self-supervised pre-training of Swin Transformers for 3D medical image analysis

Y Tang, D Yang, W Li, HR Roth… - Proceedings of the …, 2022 - openaccess.thecvf.com
Vision Transformers (ViTs) have shown great performance in self-supervised
learning of global and local representations that can be transferred to downstream …

DenseCLIP: Language-guided dense prediction with context-aware prompting

Y Rao, W Zhao, G Chen, Y Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recent progress has shown that large-scale pre-training using contrastive image-text pairs
can be a promising alternative for high-quality visual representation learning from natural …

iBOT: Image BERT pre-training with online tokenizer

J Zhou, C Wei, H Wang, W Shen, C Xie, A Yuille… - arxiv preprint arxiv …, 2021 - arxiv.org
The success of language Transformers is primarily attributed to the pretext task of masked
language modeling (MLM), where texts are first tokenized into semantically meaningful …

BEiT: BERT pre-training of image transformers

H Bao, L Dong, S Piao, F Wei - arxiv preprint arxiv:2106.08254, 2021 - arxiv.org
We introduce a self-supervised vision representation model BEiT, which stands for
Bidirectional Encoder representation from Image Transformers. Following BERT developed …