Single-model and any-modality for video object tracking

Z Wu, J Zheng, X Ren, FA Vasluianu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the realm of video object tracking, auxiliary modalities such as depth, thermal, or event data
have emerged as valuable assets to complement the RGB trackers. In practice, most existing …

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts

X Feng, X Li, S Hu, D Zhang, J Zhang… - Advances in …, 2025 - proceedings.neurips.cc
Vision-language tracking (VLT) enhances traditional visual object tracking by integrating
language descriptions, requiring the tracker to flexibly understand complex and diverse text …

ChatTracker: Enhancing visual tracking performance via chatting with multimodal large language model

Y Sun, F Yu, S Chen, Y Zhang… - Advances in …, 2025 - proceedings.neurips.cc
Visual object tracking aims to locate a targeted object in a video sequence based on an
initial bounding box. Recently, Vision-Language (VL) trackers have proposed to utilize …

Autogenic language embedding for coherent point tracking

Z Song, Y Tang, R Luo, L Ma, J Yu, YPP Chen… - Proceedings of the …, 2024 - dl.acm.org
Point tracking is a challenging task in computer vision, aiming to establish point-wise
correspondence across long video sequences. Recent advancements have primarily …

Beyond visual cues: Synchronously exploring target-centric semantics for vision-language tracking

J Ge, X Chen, J Cao, X Zhu, B Liu - arXiv preprint arXiv:2311.17085, 2023 - arxiv.org
Single object tracking aims to locate one specific target in video sequences, given its initial
state. Classical trackers rely solely on visual cues, restricting their ability to handle …

WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark

C Zhang, L Liu, G Huang, H Wen, X Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Underwater object tracking (UOT) is a foundational task for identifying and tracing
submerged entities in underwater video sequences. However, current UOT datasets suffer …

Visual Object Tracking across Diverse Data Modalities: A Review

M Wang, T Ma, S Xin, X Hou, J Xing, G Dai… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual Object Tracking (VOT) is an attractive and significant research area in computer
vision, which aims to recognize and track specific targets in video sequences where the …

Towards underwater camouflaged object tracking: An experimental evaluation of SAM and SAM 2

C Zhang, L Liu, G Huang, H Wen, X Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Over the past decade, significant progress has been made in visual object tracking, largely
due to the availability of large-scale training datasets. However, existing tracking datasets …

MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking

C Zhang, L Liu, H Wen, X Zhou, Y Wang - arXiv preprint arXiv:2411.15761, 2024 - arxiv.org
Night unmanned aerial vehicle (UAV) tracking is impeded by the challenges of poor
illumination, with previous daylight-optimized methods demonstrating suboptimal …

Zone-YOLO: Vision-Language Object Detection Using Zone Prompt

J Yang, N Jia, X Liu, R Fan, Y Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Object detection in complex traffic scenarios is crucial for Intelligent Transportation Systems
(ITS). At present, most real-time traffic object detection methods primarily rely on YOLO-style …