Language-conditioned detection transformer
We present a new open-vocabulary detection framework. Our framework uses both image-
level labels and detailed detection annotations when available. Our framework proceeds in …
OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing
Multi-object multi-part scene segmentation is a challenging task whose complexity scales
exponentially with part granularity and number of scene objects. To address the task, we …
CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models
Recent advances in Large Vision-Language Models (LVLMs) have sparked significant
progress in general-purpose vision tasks through visual instruction tuning. While some …