Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe

H Li, C Sima, J Dai, W Wang, L Lu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …

YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems

CY Wang, HYM Liao - APSIPA Transactions on Signal and …, 2024 - nowpublishers.com
This is a comprehensive review of the YOLO series of systems. Different from previous
literature surveys, this review article reexamines the characteristics of the YOLO series from …

Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional CLIP

Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2023 - proceedings.neurips.cc
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …

Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

Images speak in images: A generalist painter for in-context visual learning

X Wang, W Wang, Y Cao, C Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
In-context learning, as a new paradigm in NLP, allows the model to rapidly adapt to various
tasks with only a handful of prompts and examples. But in computer vision, the difficulties for …

Cut and learn for unsupervised object detection and instance segmentation

X Wang, R Girdhar, SX Yu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose Cut-and-LEaRn (CutLER), a simple approach for training
unsupervised object detection and segmentation models. We leverage the property of self …

SegGPT: Segmenting everything in context

X Wang, X Zhang, Y Cao, W Wang, C Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
We present SegGPT, a generalist model for segmenting everything in context. We unify
various segmentation tasks into a generalist in-context learning framework that …

RSPrompter: Learning to prompt for remote sensing instance segmentation based on visual foundation model

K Chen, C Liu, H Chen, H Zhang, W Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Leveraging the extensive training data from SA-1B, the segment anything model (SAM)
demonstrates remarkable generalization and zero-shot capabilities. However, as a category …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …