OMG-Seg: Is one model good enough for all segmentation?

X Li, H Yuan, W Li, H Ding, S Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

Reference twice: A simple and unified baseline for few-shot instance segmentation

Y Han, J Zhang, Y Wang, C Wang, Y Liu… - IEEE transactions on …, 2024 - ieeexplore.ieee.org
Few-Shot Instance Segmentation (FSIS) requires detecting and segmenting novel classes
with limited support examples. Existing methods based on Region Proposal Networks …

CLIM: Contrastive language-image mosaic for region representation

S Wu, W Zhang, L Xu, S Jin, W Liu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Detecting objects accurately from a large or open vocabulary necessitates the vision-
language alignment on region representations. However, learning such a region-text …

OV-VG: A benchmark for open-vocabulary visual grounding

C Wang, W Feng, X Li, G Cheng, S Lyu, B Liu, L Chen… - Neurocomputing, 2024 - Elsevier
Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light
of the widespread adoption of vision-based foundational models. Its primary objective is to …

Rethinking evaluation metrics of open-vocabulary segmentaion

H Zhou, T Shen, X Yang, H Huang, X Li, L Qi… - arxiv preprint arxiv …, 2023 - arxiv.org
In this paper, we highlight a problem of evaluation metrics adopted in the open-vocabulary
segmentation. That is, the evaluation process still heavily relies on closed-set metrics on …

OV-DQUO: Open-vocabulary DETR with denoising text query training and open-world unknown objects supervision

J Wang, B Chen, B Kang, Y Li, YC Chen… - arxiv preprint arxiv …, 2024 - arxiv.org
Open-vocabulary detection aims to detect objects from novel categories beyond the base
categories on which the detector is trained. However, existing open-vocabulary detectors …

VG4D: Vision-Language Model Goes 4D Video Recognition

Z Deng, X Li, X Li, Y Tong, S Zhao, M Liu - arxiv preprint arxiv:2404.11605, 2024 - arxiv.org
Understanding the real world through point cloud video is a crucial aspect of robotics and
autonomous driving systems. However, prevailing methods for 4D point cloud recognition …

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

H Wang, Q He, J Peng, H Yang, M Chi… - arxiv preprint arxiv …, 2024 - arxiv.org
Open-vocabulary detection (OVD) aims to detect objects beyond a predefined set of
categories. As a pioneering model incorporating the YOLO series into OVD, YOLO-World is …

Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection

Y Chen, W Yao, L Meng, S Wu, Z Wu… - arxiv preprint arxiv …, 2024 - arxiv.org
Enabling models to recognize vast open-world categories has been a longstanding pursuit
in object detection. By leveraging the generalization capabilities of vision-language models …