Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Artificial intelligence for remote sensing data analysis: A review of challenges and opportunities

L Zhang, L Zhang - IEEE Geoscience and Remote Sensing …, 2022 - ieeexplore.ieee.org
Artificial intelligence (AI) plays a growing role in remote sensing (RS). Applications of AI,
particularly machine learning algorithms, range from initial image processing to high-level …

Scaling vision transformers to 22 billion parameters

M Dehghani, J Djolonga, B Mustafa… - International …, 2023 - proceedings.mlr.press
The scaling of Transformers has driven breakthrough capabilities for language models. At
present, the largest large language models (LLMs) contain upwards of 100B parameters …

LAMM: Language-assisted multi-modal instruction-tuning dataset, framework, and benchmark

Z Yin, J Wang, J Cao, Z Shi, D Liu… - Advances in …, 2024 - proceedings.neurips.cc
Large language models have emerged as a promising approach towards achieving
general-purpose AI agents. The thriving open-source LLM community has greatly accelerated the …

Scale-MAE: A scale-aware masked autoencoder for multiscale geospatial representation learning

CJ Reed, R Gupta, S Li, S Brockman… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large, pretrained models are commonly finetuned with imagery that is heavily augmented to
mimic different conditions and scales, with the resulting models used for various tasks with …

A comprehensive review on deep learning based remote sensing image super-resolution methods

P Wang, B Bayram, E Sertel - Earth-Science Reviews, 2022 - Elsevier
Satellite imageries are an important geoinformation source for different applications in the
Earth Science field. However, due to the limitation of the optic and sensor technologies and …

Remote sensing scene classification via multi-stage self-guided separation network

J Wang, W Li, M Zhang, R Tao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, remote-sensing scene classification is one of the research hotspots and has
played an important role in the field of intelligent interpretation of remote-sensing data …

Advancing plain vision transformer toward remote sensing foundation model

D Wang, Q Zhang, Y Xu, J Zhang, B Du… - … on Geoscience and …, 2022 - ieeexplore.ieee.org
Large-scale vision foundation models have made significant progress in visual tasks on
natural images, with vision transformers (ViTs) being the primary choice due to their good …

Transformers in remote sensing: A survey

AA Aleissaee, A Kumar, RM Anwer, S Khan… - Remote Sensing, 2023 - mdpi.com
Deep learning-based algorithms have seen a massive popularity in different areas of remote
sensing image analysis over the past decade. Recently, transformer-based architectures …

RingMo: A remote sensing foundation model with masked image modeling

X Sun, P Wang, W Lu, Z Zhu, X Lu, Q He… - … on Geoscience and …, 2022 - ieeexplore.ieee.org
Deep learning approaches have contributed to the rapid development of remote sensing
(RS) image interpretation. The most widely used training paradigm is to use ImageNet …