Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

Visual semantic segmentation based on few/zero-shot learning: An overview

W Ren, Y Tang, Q Sun, C Zhao… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
Visual semantic segmentation aims at separating a visual sample into diverse blocks with
specific semantic attributes and identifying the category for each block, and it plays a crucial …

OnlineRefer: A simple online baseline for referring video object segmentation

D Wu, T Wang, Y Zhang, X Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Referring video object segmentation (RVOS) aims at segmenting an object in a video
following human instruction. Current state-of-the-art methods fall into an offline pattern, in …

Spectrum-guided multi-granularity referring video object segmentation

B Miao, M Bennamoun, Y Gao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current referring video object segmentation (R-VOS) techniques extract conditional kernels
from encoded (low-resolution) vision-language features to segment the decoded high …

A comprehensive survey on video saliency detection with auditory information: the audio-visual consistency perceptual is the key!

C Chen, M Song, W Song, L Guo… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Video saliency detection (VSD) aims at fast locating the most attractive
objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied …

Robust referring video object segmentation with cyclic structural consensus

X Li, J Wang, X Xu, X Li, B Raj… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Referring Video Object Segmentation (R-VOS) is a challenging task that aims to
segment an object in a video based on a linguistic expression. Most existing R-VOS …

Local-global context aware transformer for language-guided video segmentation

C Liang, W Wang, T Zhou, J Miao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
We explore the task of language-guided video segmentation (LVS). Previous algorithms
mostly adopt 3D CNNs to learn video representation, struggling to capture long-term context …

Segment every reference object in spatial and temporal spaces

J Wu, Y Jiang, B Yan, H Lu… - Proceedings of the …, 2023 - openaccess.thecvf.com
The reference-based object segmentation tasks, namely referring image segmentation
(RIS), referring video object segmentation (RVOS), and video object segmentation (VOS) …

Self-supervised pretraining for RGB-D salient object detection

X Zhao, Y Pang, L Zhang, H Lu, X Ruan - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Existing CNNs-Based RGB-D salient object detection (SOD) networks are all
required to be pretrained on the ImageNet to learn the hierarchy features which helps …

Decoupling static and hierarchical motion perception for referring video segmentation

S He, H Ding - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Referring video segmentation relies on natural language expressions to identify and
segment objects often emphasizing motion clues. Previous works treat a sentence as a …