SAM 2: Segment anything in images and videos
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …
AnyDoor: Zero-shot object-level image customization
This work presents AnyDoor, a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this, we define a common …
Tracking anything with decoupled video segmentation
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
MOSE: A new dataset for video object segmentation in complex scenes
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …
OMG-Seg: Is one model good enough for all segmentation?
In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …
DragAnything: Motion control for anything using entity representation
We introduce DragAnything, which utilizes an entity representation to achieve motion control
for any object in controllable video generation. Comparison to existing motion control …
Tube-Link: A flexible cross tube framework for universal video segmentation
Video segmentation aims to segment and track every pixel in diverse scenarios accurately.
In this paper, we present Tube-Link, a versatile framework that addresses multiple core tasks …
Video K-Net: A simple, strong, and unified baseline for video segmentation
This paper presents Video K-Net, a simple, strong, and unified framework for fully
end-to-end video panoptic segmentation. The method is built upon K-Net, a method that unifies …