- Academic Search

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer

Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Cited by: 610. Versions: 2.

A review of location encoding for GeoAI: methods and applications

G Mai, K Janowicz, Y Hu, S Gao, B Yan… - International Journal …, 2022 - Taylor & Francis

ABSTRACT A common need for artificial intelligence models in the broader geoscience is to
encode various types of spatial data, such as points, polylines, polygons, graphs, or rasters …

Cited by: 136. Versions: 7.

Vision-language models for vision tasks: A survey

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org

Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …

Cited by: 428. Versions: 9.

Eva-02: A visual representation for neon genesis

Y Fang, Q Sun, X Wang, T Huang, X Wang… - Image and Vision …, 2024 - Elsevier

We launch EVA-02, a next-generation Transformer-based visual representation pre-trained
to reconstruct strong and robust language-aligned vision features via masked image …

Cited by: 227. Versions: 3.

Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks

Z Chen, J Wu, W Wang, W Su, G Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com

The exponential growth of large language models (LLMs) has opened up numerous
possibilities for multi-modal AGI systems. However, the progress in vision and vision …

Cited by: 172. Versions: 4.

What does a platypus look like? generating customized prompts for zero-shot image classification

S Pratt, I Covert, R Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Open-vocabulary models are a promising new paradigm for image classification. Unlike
traditional classification models, open-vocabulary models classify among any arbitrary set of …

Cited by: 239. Versions: 7.

Florence: A new foundation model for computer vision

L Yuan, D Chen, YL Chen, N Codella, X Dai… - arxiv preprint arxiv …, 2021 - arxiv.org

Automated visual understanding of our diverse and open world demands computer vision
models to generalize well with minimal customization for specific tasks, similar to human …

Cited by: 961. Versions: 2.

Is synthetic data from generative models ready for image recognition?

R He, S Sun, X Yu, C Xue, W Zhang, P Torr… - arxiv preprint arxiv …, 2022 - arxiv.org

Recent text-to-image generation models have shown promising results in generating high-
fidelity photo-realistic images. Though the results are astonishing to human eyes, how …

Cited by: 283. Versions: 4.

With a little help from my friends: Nearest-neighbor contrastive learning of visual representations

D Dwibedi, Y Aytar, J Tompson… - Proceedings of the …, 2021 - openaccess.thecvf.com

Self-supervised learning algorithms based on instance discrimination train encoders to be
invariant to pre-defined transformations of the same instance. While most methods treat …

Cited by: 544. Versions: 5.

Fine-grained image analysis with deep learning: A survey

XS Wei, YZ Song, O Mac Aodha, J Wu… - IEEE transactions on …, 2021 - ieeexplore.ieee.org

Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer
vision and pattern recognition, and underpins a diverse set of real-world applications. The …

Cited by: 338. Versions: 8.

Birdsnap: Large-scale fine-grained visual categorization of birds

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt
