- Academic Search

Evolving interpretable visual classifiers with large language models

M Chiquier, U Mall, C Vondrick - European Conference on Computer …, 2024 - Springer

Multimodal pre-trained models, such as CLIP, are popular for zero-shot classification due to
their open-vocabulary flexibility and high performance. However, vision-language models …

Cited by: 9

Copt: Unsupervised domain adaptive segmentation using domain-agnostic text embeddings

C Mata, K Ranasinghe, MS Ryoo - European Conference on Computer …, 2024 - Springer

Unsupervised domain adaptation (UDA) involves learning class semantics from labeled
data within a source domain that generalize to an unseen target domain. UDA methods are …

Cited by: 1

From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models

T Pulli, S Thalhammer, S Schwaiger… - arxiv preprint arxiv …, 2024 - arxiv.org

Robots are increasingly envisioned to interact in real-world scenarios, where they must
continuously adapt to new situations. To detect and grasp novel objects, zero-shot pose …

Cited by: 1

Open-Vocabulary Object Detection via Neighboring Region Attention Alignment

S Qiang, X Li, Y Liang, W Liao, T He, P Peng - arxiv preprint arxiv …, 2024 - arxiv.org

The nature of diversity in real-world environments necessitates neural network models to
expand from closed category settings to accommodate novel emerging categories. In this …

Cited by: 1

EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection

Q Lei, B Wang, RT Tan - arxiv preprint arxiv:2410.23904, 2024 - arxiv.org

Detecting Human-Object Interactions (HOI) in zero-shot settings, where models must handle
unseen classes, poses significant challenges. Existing methods that rely on aligning visual …

SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition

H Tan, Z Tan, J Li, J Wan, Z Lei, SZ Li - arxiv preprint arxiv:2407.20920, 2024 - arxiv.org

Multi-label image recognition is a fundamental task in computer vision. Recently, Vision-
Language Models (VLMs) have made notable advancements in this area. However …

Sampling Bag of Views for Open-Vocabulary Object Detection

H Choi, J Choe, H Shim - arxiv preprint arxiv:2412.18273, 2024 - arxiv.org

Existing open-vocabulary object detection (OVD) develops methods for testing unseen
categories by aligning object region embeddings with corresponding VLM features. A recent …

VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models

CC Sartori, C Blum, F Bistaffa - IEEE Access, 2025 - ieeexplore.ieee.org

The fast advancement of Large Vision-Language Models (LVLMs) has shown immense
potential. These models are increasingly capable of tackling abstract visual tasks. Geometric …

Test-Time Optimization for Domain Adaptive Open Vocabulary Segmentation

U De Silva, D Samaraweera, S Wanigathunga… - arxiv preprint arxiv …, 2025 - arxiv.org

We present Seg-TTO, a novel framework for zero-shot, open-vocabulary semantic
segmentation (OVSS), designed to excel in specialized domain tasks. While current open …

Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation

Y Zheng, K Liu - arxiv preprint arxiv:2404.08603, 2024 - arxiv.org

Open-vocabulary object detection (OVOD) aims at localizing and recognizing visual objects
from novel classes unseen at the training time. Whereas, empirical studies reveal that …

Cited by: 1

Saved in "My Library"

Llms meet vlms: Boost open vocabulary object detection with fine-grained descriptors

Evolving interpretable visual classifiers with large language models
