RemoteCLIP: A vision language foundation model for remote sensing

F Liu, D Chen, Z Guan, X Zhou, J Zhu… - … on Geoscience and …, 2024 - ieeexplore.ieee.org
General-purpose foundation models have led to recent breakthroughs in artificial
intelligence (AI). In remote sensing, self-supervised learning (SSL) and masked image …

A comprehensive survey on image captioning: from handcrafted to deep learning-based techniques, a taxonomy and open research issues

H Sharma, D Padha - Artificial Intelligence Review, 2023 - Springer
Image captioning is a pretty modern area of the convergence of computer vision and natural
language processing and is widely used in a range of applications such as multi-modal …

Exploring a fine-grained multiscale method for cross-modal remote sensing image retrieval

Z Yuan, W Zhang, K Fu, X Li, C Deng, H Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Remote sensing (RS) cross-modal text-image retrieval has attracted extensive attention for
its advantages of flexible input and efficient query. However, traditional methods ignore the …

A deep semantic alignment network for the cross-modal image-text retrieval in remote sensing

Q Cheng, Y Zhou, P Fu, Y Xu… - IEEE Journal of Selected …, 2021 - ieeexplore.ieee.org
Because of the rapid growth of multimodal data from the internet and social media, a cross-
modal retrieval has become an important and valuable task in recent years. The purpose of …

Parameter-efficient transfer learning for remote sensing image-text retrieval

Y Yuan, Y Zhan, Z Xiong - IEEE Transactions on Geoscience …, 2023 - ieeexplore.ieee.org
Vision-and-language pretraining (VLP) models have experienced a surge in popularity
recently. By fine-tuning them on specific datasets, significant performance improvements …

Language Integration in Remote Sensing: Tasks, datasets, and future directions

L Bashmal, Y Bazi, F Melgani… - … and Remote Sensing …, 2023 - ieeexplore.ieee.org
The emerging field of vision–language models, which combines computer vision and natural
language processing (NLP), has gained significant interest and exploration. This integration …

A novel SVM-based decoder for remote sensing image captioning

G Hoxha, F Melgani - IEEE Transactions on Geoscience and …, 2021 - ieeexplore.ieee.org
Most of the remote sensing image captioning (IC) models are based on encoder–decoder
frameworks where a convolutional neural network (CNN) encodes the image information …

Change captioning: A new paradigm for multitemporal remote sensing image analysis

G Hoxha, S Chouaf, F Melgani… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Change detection (CD) is among the most important applications in remote sensing (RS)
that allows identifying the changes that occurred in a given geographical area across …

SD-RSIC: Summarization-driven deep remote sensing image captioning

G Sumbul, S Nayak, B Demir - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Deep neural networks (DNNs) have been recently found popular for image captioning
problems in remote sensing (RS). Existing DNN-based approaches rely on the availability of …

Multilanguage transformer for improved text to remote sensing image retrieval

MM Al Rahhal, Y Bazi, NA Alsharif… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Cross-modal text-image retrieval in remote sensing (RS) provides a flexible retrieval
experience for mining useful information from RS repositories. However, existing methods …