Deep learning for video object segmentation: a review

M Gao, F Zheng, JJQ Yu, C Shan, G Ding… - Artificial Intelligence …, 2023 - Springer
As one of the fundamental problems in the field of video understanding, video object
segmentation aims at segmenting objects of interest throughout the given video sequence …

Sam 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arxiv preprint arxiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

Xmem: Long-term video object segmentation with an atkinson-shiffrin memory model

HK Cheng, AG Schwing - European Conference on Computer Vision, 2022 - Springer
We present XMem, a video object segmentation architecture for long videos with unified
feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video …

Segment and track anything

Y Cheng, L Li, Y Xu, X Li, Z Yang, W Wang… - arxiv preprint arxiv …, 2023 - arxiv.org
This report presents a framework called Segment And Track Anything (SAMTrack) that
allows users to precisely and effectively segment and track any object in a video …

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

Visual semantic segmentation based on few/zero-shot learning: An overview

W Ren, Y Tang, Q Sun, C Zhao… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org
Visual semantic segmentation aims at separating a visual sample into diverse blocks with
specific semantic attributes and identifying the category for each block, and it plays a crucial …

Lavt: Language-aware vision transformer for referring image segmentation

Z Yang, J Wang, Y Tang, K Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Referring image segmentation is a fundamental vision-language task that aims to segment
out an object referred to by a natural language expression from an image. One of the key …

Dropmae: Masked autoencoders with spatial-attention dropout for tracking tasks

Q Wu, T Yang, Z Liu, B Wu, Y Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we study masked autoencoder (MAE) pretraining on videos for matching-
based downstream tasks, including visual object tracking (VOT) and video object …

Towards grand unification of object tracking

B Yan, Y Jiang, P Sun, D Wang, Z Yuan, P Luo… - European Conference on …, 2022 - Springer
We present a unified method, termed Unicorn, that can simultaneously solve four tracking
problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters …