A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …
DINOv2: Learning robust visual features without supervision
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …
CLIP2Scene: Towards label-efficient 3D scene understanding by CLIP
Contrastive Language-Image Pre-training (CLIP) achieves promising results in 2D
zero-shot and few-shot learning. Despite the impressive performance in 2D, applying CLIP …
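As a concrete illustration of the 2D zero-shot recognition this abstract refers to, the sketch below shows the usual CLIP-style recipe: embed an image and a set of class prompts, then pick the class whose text embedding is most similar to the image embedding. It is a minimal, hypothetical example rather than code from the paper; the tensor shapes, the 100.0 logit scale, and the random embeddings standing in for real encoder outputs are all assumptions.

```python
# Hypothetical sketch of CLIP-style zero-shot classification (not the paper's code).
# The random tensors below stand in for outputs of pretrained image/text encoders.
import torch
import torch.nn.functional as F

def zero_shot_classify(image_feat: torch.Tensor,
                       class_text_feats: torch.Tensor) -> torch.Tensor:
    """image_feat: (d,) embedding of one image; class_text_feats: (C, d) embeddings
    of prompts such as "a photo of a {class}". Returns per-class probabilities."""
    image_feat = F.normalize(image_feat, dim=-1)            # unit-norm image embedding
    class_text_feats = F.normalize(class_text_feats, dim=-1)
    logits = 100.0 * class_text_feats @ image_feat          # scaled cosine similarities
    return logits.softmax(dim=-1)                           # probabilities over class prompts

# Toy usage with random embeddings in place of real encoder outputs.
probs = zero_shot_classify(torch.randn(512), torch.randn(1000, 512))
print(probs.argmax().item())
```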
Masked feature prediction for self-supervised visual pre-training
We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training
of video models. Our approach first randomly masks out a portion of the input sequence and …
SimMIM: A simple framework for masked image modeling
This paper presents SimMIM, a simple framework for masked image modeling. We have
simplified recently proposed relevant approaches, without the need for special designs …
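For readers unfamiliar with masked image modeling, the sketch below illustrates the general recipe that SimMIM-style methods simplify: randomly mask a fraction of the patch tokens, encode the corrupted sequence, and regress the original content of the masked patches only. The tiny transformer, the 0.6 mask ratio, and the L1-style reconstruction target are illustrative assumptions, not the paper's exact design.

```python
# Minimal masked-image-modeling sketch (illustrative, not SimMIM's actual architecture).
import torch
import torch.nn as nn

class TinyMIM(nn.Module):
    def __init__(self, patch_dim=768, mask_ratio=0.6):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_token = nn.Parameter(torch.zeros(patch_dim))   # learned [MASK] embedding
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=patch_dim, nhead=8, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(patch_dim, patch_dim)   # lightweight prediction head

    def forward(self, patches):                        # patches: (B, N, patch_dim)
        B, N, _ = patches.shape
        num_mask = int(self.mask_ratio * N)
        idx = torch.rand(B, N).argsort(dim=1)          # random permutation per image
        mask = torch.zeros(B, N, dtype=torch.bool)
        mask[torch.arange(B).unsqueeze(1), idx[:, :num_mask]] = True   # True = masked patch
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand(B, N, -1), patches)
        pred = self.head(self.encoder(x))
        # reconstruction loss is computed on the masked positions only
        return ((pred - patches).abs() * mask.unsqueeze(-1)).sum() / mask.sum()

loss = TinyMIM()(torch.randn(2, 196, 768))
loss.backward()
```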
Masked siamese networks for label-efficient learning
We propose Masked Siamese Networks (MSN), a self-supervised learning
framework for learning image representations. Our approach matches the representation of …
Context autoencoder for self-supervised representation learning
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE),
for self-supervised representation pretraining. We pretrain an encoder by making predictions …
Extract free dense labels from CLIP
Contrastive Language-Image Pre-training (CLIP) has made a remarkable
breakthrough in open-vocabulary zero-shot image recognition. Many recent studies …
Vision-language pre-training with triple contrastive learning
Vision-language representation learning largely benefits from image-text alignment through
contrastive losses (e.g., InfoNCE loss). The success of this alignment strategy is attributed to …
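Since the abstract names the InfoNCE loss explicitly, a minimal sketch of the symmetric image-text version used in CLIP-style alignment may help: matching image-caption pairs sit on the diagonal of a similarity matrix and are treated as positives, while all other pairings act as negatives. The function name, batch size, and temperature of 0.07 below are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of a symmetric image-text InfoNCE loss (illustrative, not the paper's code).
import torch
import torch.nn.functional as F

def info_nce(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (B, d) embeddings of B matching image-text pairs.
    The i-th image and i-th text are positives; all other pairings are negatives."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) cosine similarities
    targets = torch.arange(image_emb.size(0))         # positives lie on the diagonal
    loss_i2t = F.cross_entropy(logits, targets)       # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)   # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs.
print(info_nce(torch.randn(8, 256), torch.randn(8, 256)))
```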