Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org

Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Cited by 136

[PDF] nowpublishers.com

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Cited by 197

[PDF] neurips.cc

Segment everything everywhere all at once

X Zou, J Yang, H Zhang, F Li, L Li… - Advances in …, 2024 - proceedings.neurips.cc

In this work, we present SEEM, a promptable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …

Cited by 530

[PDF] thecvf.com

Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

Cited by 429

[PDF] neurips.cc

Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional clip

Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2023 - proceedings.neurips.cc

Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …

Cited by 127

[PDF] thecvf.com

Side adapter network for open-vocabulary semantic segmentation

M Xu, Z Zhang, F Wei, H Hu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

This paper presents a new framework for open-vocabulary semantic segmentation with the
pre-trained vision-language model, named SAN. Our approach models the semantic …

Cited by 271

[PDF] arxiv.org

Segment and Recognize Anything at Any Granularity

F Li, H Zhang, P Sun, X Zou, S Liu, C Li, J Yang… - … on Computer Vision, 2024 - Springer

In this work, we introduce Semantic-SAM, an augmented image segmentation foundation for
segmenting and recognizing anything at desired granularities. Compared to the …

Cited by 167

[PDF] arxiv.org

Vision-language models for vision tasks: A survey

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org

Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …

Cited by 427

[PDF] thecvf.com

Generalized decoding for pixel, image, and language

X Zou, ZY Dou, J Yang, Z Gan, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present X-Decoder, a generalized decoding model that can predict pixel-level
segmentation and language tokens seamlessly. X-Decoder takes as input two types of …

Cited by 252

[PDF] thecvf.com

Open-vocabulary semantic segmentation with mask-adapted clip

F Liang, B Wu, X Dai, K Li, Y Zhao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Open-vocabulary semantic segmentation aims to segment an image into semantic regions
according to text descriptions, which may not have been seen during training. Recent two …

Cited by 467

Groupvit: Semantic segmentation emerges from text supervision

Foundation Models Defining a New Era in Vision: a Survey and Outlook
