Brain-inspired remote sensing foundation models and open problems: A comprehensive survey

L Jiao, Z Huang, X Lu, X Liu, Y Yang… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
The foundation model (FM) has garnered significant attention for its remarkable transfer
performance in downstream tasks. Typically, it undergoes task-agnostic pretraining on a …

Remote sensing image change captioning with dual-branch transformers: A new method and a large scale dataset

C Liu, R Zhao, H Chen, Z Zou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Analyzing land cover changes with multitemporal remote sensing (RS) images is crucial for
environmental protection and land planning. In this article, we explore RS image change …

RS-LLaVA: A large vision-language model for joint captioning and question answering in remote sensing imagery

Y Bazi, L Bashmal, MM Al Rahhal, R Ricci, F Melgani - Remote Sensing, 2024 - mdpi.com
In this paper, we delve into the innovative application of large language models (LLMs) and
their extension, large vision-language models (LVLMs), in the field of remote sensing (RS) …

A decoupling paradigm with prompt learning for remote sensing image change captioning

C Liu, R Zhao, J Chen, Z Qi, Z Zou… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Remote sensing image change captioning (RSICC) is a novel task that aims to describe the
differences between bitemporal images by natural language. Previous methods ignore a …

Language Integration in Remote Sensing: Tasks, datasets, and future directions

L Bashmal, Y Bazi, F Melgani… - … and Remote Sensing …, 2023 - ieeexplore.ieee.org
The emerging field of vision–language models, which combines computer vision and natural
language processing (NLP), has gained significant interest and exploration. This integration …

A multiscale grouping transformer with CLIP latents for remote sensing image captioning

L Meng, J Wang, R Meng, Y Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent progress has shown that integrating multiscale visual features with advanced
Transformer architectures is a promising approach for remote sensing image captioning …

HCNet: Hierarchical feature aggregation and cross-modal feature alignment for remote sensing image captioning

Z Yang, Q Li, Y Yuan, Q Wang - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Remote sensing image captioning (RSIC) aims to describe the crucial objects from remote
sensing images in the form of natural language. The inefficient utilization of object texture …

Prior knowledge-guided transformer for remote sensing image captioning

L Meng, J Wang, Y Yang, L Xiao - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Remote sensing image (RSI) captioning aims to generate meaningful and grammatically
accurate sentences for RSIs. However, in comparison to natural image captioning, RSI …

Cooperative connection transformer for remote sensing image captioning

K Zhao, W Xiong - IEEE Transactions on Geoscience and …, 2024 - ieeexplore.ieee.org
Feature extraction is fundamental for successful remote sensing image captioning (RSIC).
The representation forms of grid features and region features differ significantly. Grid …

Exploring region features in remote sensing image captioning

K Zhao, W Xiong - International Journal of Applied Earth Observation and …, 2024 - Elsevier
Remote sensing image captioning (RSIC), an emerging field of cross-modal tasks, has
become a popular research topic in recent years. Feature extraction underlies all RSIC …