From Image to Video: An Empirical Study of Diffusion Representations

P Vélez, LF Polanía, Y Yang, C Zhang, R Kabra… - arXiv preprint arXiv …, 2025 - arxiv.org
Diffusion models have revolutionized generative modeling, enabling unprecedented realism
in image and video synthesis. This success has sparked interest in leveraging their …

SMITE: Segment Me In TimE

A Alimohammadi, S Nag, SA Taghanaki… - arXiv preprint arXiv …, 2024 - arxiv.org
Segmenting an object in a video presents significant challenges. Each pixel must be
accurately labelled, and these labels must remain consistent across frames. The difficulty …

Video Diffusion Models Learn the Structure of the Dynamic World

Z Bao, A Bagchi, YX Wang, P Tokmakov, M Hebert - openreview.net
Diffusion models have demonstrated significant progress in visual perception tasks due to
their ability to capture fine-grained, object-centric features through large-scale vision …

Temporal Prompting Matters: Rethinking Referring Video Object Segmentation

CS Lin, MH Chen, IJ Liu, CY Wang, S Liu, YCF Wang - openreview.net
Referring Video Object Segmentation (RVOS) aims to segment the object referred to by the
query sentence in the video. Most existing methods require end-to-end training with dense …
