Semantic segmentation using Vision Transformers: A survey

H Thisanke, C Deshan, K Chamith… - … Applications of Artificial …, 2023 - Elsevier
Semantic segmentation has a broad range of applications in a variety of domains including
land coverage analysis, autonomous driving, and medical image analysis. Convolutional …

Remote sensing object detection in the deep learning era—a review

S Gui, S Song, R Qin, Y Tang - Remote Sensing, 2024 - mdpi.com
Given the large volume of remote sensing images collected daily, automatic object detection
and segmentation have been a consistent need in Earth observation (EO). However, objects …

Segment anything

A Kirillov, E Mintun, N Ravi, H Mao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …

Lisa: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Although perception systems have made remarkable advancements in recent years they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

Visual chatgpt: Talking, drawing and editing with visual foundation models

C Wu, S Yin, W Qi, X Wang, Z Tang, N Duan - arxiv preprint arxiv …, 2023 - arxiv.org
ChatGPT is attracting a cross-field interest as it provides a language interface with
remarkable conversational competency and reasoning capabilities across many domains …

Planning-oriented autonomous driving

Y Hu, J Yang, L Chen, K Li, C Sima… - Proceedings of the …, 2023 - openaccess.thecvf.com
Modern autonomous driving system is characterized as modular tasks in sequential order,
ie, perception, prediction, and planning. In order to perform a wide diversity of tasks and …

Flatten transformer: Vision transformer using focused linear attention

D Han, X Pan, Y Han, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computation complexity of self-attention has been a persistent challenge
when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …

Side adapter network for open-vocabulary semantic segmentation

M Xu, Z Zhang, F Wei, H Hu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper presents a new framework for open-vocabulary semantic segmentation with the
pre-trained vision-language model, named SAN. Our approach models the semantic …

Symphonize 3d semantic scene completion with contextual instance queries

H Jiang, T Cheng, N Gao, H Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract 3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal
undertaking in autonomous driving aiming to predict the voxel occupancy within volumetric …