A comprehensive review of modern object segmentation approaches

Y Wang, U Ahsan, H Li, M Hagen - Foundations and Trends® …, 2022 - nowpublishers.com
Image segmentation is the task of associating pixels in an image with their respective object
class labels. It has a wide range of applications in many industries including healthcare …

LISA: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Although perception systems have made remarkable advancements in recent years, they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

Learning to upsample by learning to sample

W Liu, H Lu, H Fu, Z Cao - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We present DySample, an ultra-lightweight and effective dynamic upsampler. While
impressive performance gains have been witnessed from recent kernel-based dynamic …

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - IEEE transactions on …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

DetCLIP: Dictionary-enriched visual-concept paralleled pre-training for open-world detection

L Yao, J Han, Y Wen, X Liang, D Xu… - Advances in …, 2022 - proceedings.neurips.cc
Open-world object detection, as a more general and challenging goal, aims to recognize
and localize objects described by arbitrary category names. The recent work GLIP …

ClusterFormer: Clustering as a universal visual learner

J Liang, Y Cui, Q Wang, T Geng… - Advances in neural …, 2023 - proceedings.neurips.cc
This paper presents ClusterFormer, a universal vision model that is based on the Clustering
paradigm with TransFormer. It comprises two novel designs: 1) recurrent cross-attention …

FaPN: Feature-aligned pyramid network for dense image prediction

S Huang, Z Lu, R Cheng, C He - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recent advancements in deep neural networks have made remarkable leap-forwards in
dense image prediction. However, the issue of feature alignment remains as neglected by …

Deformable feature aggregation for dynamic multi-modal 3D object detection

Z Chen, Z Li, S Zhang, L Fang, Q Jiang… - European conference on …, 2022 - Springer
Point clouds and RGB images are two general perceptional sources in autonomous driving.
The former can provide accurate localization of objects, and the latter is denser and richer in …