Test-time training with masked autoencoders

Y Gandelsman, Y Sun, X Chen… - Advances in Neural …, 2022 - proceedings.neurips.cc
Test-time training adapts to a new test distribution on the fly by optimizing a model for each
test input using self-supervision. In this paper, we use masked autoencoders for this one …

TTT++: When does self-supervised test-time training fail or thrive?

Y Liu, P Kothari, B Van Delft… - Advances in …, 2021 - proceedings.neurips.cc
Test-time training (TTT) through self-supervised learning (SSL) is an emerging paradigm to
tackle distributional shifts. Despite encouraging results, it remains unclear when this …

MinVIS: A minimal video instance segmentation framework without video-based training

DA Huang, Z Yu, A Anandkumar - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose MinVIS, a minimal video instance segmentation (VIS) framework that achieves
state-of-the-art VIS performance with neither video-based architectures nor training …

A survey on deep learning technique for video segmentation

T Zhou, F Porikli, DJ Crandall… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Video segmentation—partitioning video frames into multiple segments or objects—plays a
critical role in a broad range of practical applications, from enhancing visual effects in movie …

Do different tracking tasks require different appearance models?

Z Wang, H Zhao, YL Li, S Wang… - Advances in Neural …, 2021 - proceedings.neurips.cc
Tracking objects of interest in a video is one of the most popular and widely applicable
problems in computer vision. However, with the years, a Cambrian explosion of use cases …

Dynamically instance-guided adaptation: A backward-free approach for test-time domain adaptive semantic segmentation

W Wang, Z Zhong, W Wang, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we study the application of Test-time domain adaptation in semantic
segmentation (TTDA-Seg) where both efficiency and effectiveness are crucial. Existing …

Mask-free video instance segmentation

L Ke, M Danelljan, H Ding, YW Tai… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recent advancement in Video Instance Segmentation (VIS) has largely been driven by
the use of deeper and increasingly data-hungry transformer-based models. However, video …

End-to-end 3D tracking with decoupled queries

Y Li, Z Yu, J Philion, A Anandkumar… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we present an end-to-end framework for camera-based 3D multi-object tracking,
called DQTrack. To avoid heuristic design in detection-based trackers, recent query-based …

A gated attention transformer for multi-person pose tracking

A Doering, J Gall - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Multi-person pose tracking is an important element for many applications and requires
estimating the human poses of all persons in a video and tracking them over time. The …

What is Point Supervision Worth in Video Instance Segmentation?

S Huang, DA Huang, Z Yu, S Lan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment,
and track objects in videos. Conventional VIS methods rely on densely annotated object …