- Academic Search

A Yu, Y Quan, R Yu, W Guo, X Wang, D Hong… - Remote Sensing, 2023 - mdpi.com

The annotations used during the training process are crucial for the inference results of
remote sensing images (RSIs) based on a deep learning framework. Unlabeled RSIs can be …

Cited by 12; all 4 versions


[PDF] neurips.cc

Fine-grained late-interaction multi-modal retrieval for retrieval augmented visual question answering

W Lin, J Chen, J Mei, A Coca… - Advances in Neural …, 2024 - proceedings.neurips.cc

Knowledge-based Visual Question Answering (KB-VQA) requires VQA systems to
utilize knowledge from external knowledge bases to answer visually-grounded questions …

Cited by 41; all 5 versions


[HTML] mdpi.com

Poseidon: A data augmentation tool for small object detection datasets in maritime environments

P Ruiz-Ponce, D Ortiz-Perez, J Garcia-Rodriguez… - Sensors, 2023 - mdpi.com

Certain fields present significant challenges when attempting to train complex Deep
Learning architectures, particularly when the available datasets are limited and imbalanced …

Cited by 46; all 11 versions


[PDF] chemrxiv.org

Coati: Multimodal contrastive pretraining for representing and traversing chemical space

B Kaufman, EC Williams, C Underkoffler… - Journal of Chemical …, 2024 - ACS Publications

Creating a successful small molecule drug is a challenging multiparameter optimization
problem in an effectively infinite space of possible molecules. Generative models have …

Cited by 14; all 5 versions


[PDF] mdpi.com

Vision–language model for visual question answering in medical imagery

Y Bazi, MMA Rahhal, L Bashmal, M Zuair - Bioengineering, 2023 - mdpi.com

In the clinical and healthcare domains, medical images play a critical role. A mature medical
visual question answering system (VQA) can improve diagnosis by answering clinical …

Cited by 62; all 9 versions


[HTML] mdpi.com

Vision Transformers for Image Classification: A Comparative Survey

Y Wang, Y Deng, Y Zheng, P Chattopadhyay, L Wang - Technologies, 2025 - mdpi.com

Transformers were initially introduced for natural language processing, leveraging the
self-attention mechanism. They require minimal inductive biases in their design and can function …

Cited by 1; all 3 versions


[PDF] mdpi.com

Machine-to-machine visual dialoguing with ChatGPT for enriched textual image description

R Ricci, Y Bazi, F Melgani - Remote Sensing, 2024 - mdpi.com

Image captioning is a technique that enables the automatic extraction of natural language
descriptions about the contents of an image. On the one hand, information in the form of …

Cited by 8; all 4 versions


[PDF] rsc.org

Domain-specific chatbots for science using embeddings

KG Yager - Digital Discovery, 2023 - pubs.rsc.org

Large language models (LLMs) have emerged as powerful machine-learning systems
capable of handling a myriad of tasks. Tuned versions of these systems have been turned …

Cited by 17; all 5 versions


[PDF] mdpi.com

Summarization of videos with the signature transform

J de Curtò, I de Zarzà, G Roig, CT Calafate - Electronics, 2023 - mdpi.com

This manuscript presents a new benchmark for assessing the quality of visual summaries
without the need for human annotators. It is based on the Signature Transform, specifically …

Cited by 11; all 8 versions


[PDF] mdpi.com

A Fusion Encoder with Multi-Task Guidance for Cross-Modal Text–Image Retrieval in Remote Sensing

X Zhang, W Li, X Wang, L Wang, F Zheng, L Wang… - Remote Sensing, 2023 - mdpi.com

In recent years, there has been a growing interest in remote sensing image–text
cross-modal retrieval due to the rapid development of space information technology and the …

Cited by 8; all 5 versions


Learning transferable visual models from natural language supervision. arxiv 2021

Deep learning methods for semantic segmentation in remote sensing with small data: A survey
