Referring atomic video action recognition

K Peng, J Fu, K Yang, D Wen, Y Chen, R Liu… - … on Computer Vision, 2024 - Springer
We introduce a new task called Referring Atomic Video Action Recognition (RAVAR),
aimed at identifying atomic actions of a particular person based on a textual description and …

Deep Feature-Based Hyperspectral Object Tracking: An Experimental Survey and Outlook

Y Wang, X Li, X Yang, F Ge, B Wei, L Li, S Yue - Remote Sensing, 2025 - mdpi.com
With the rapid advancement of hyperspectral imaging technology, hyperspectral object
tracking (HOT) has become a research hotspot in the field of remote sensing. Advanced …

AudioScenic: Audio-Driven Video Scene Editing

K Shen, R Quan, L Zhu, J …

… Referring Multi-Object Tracking

Y Zhang, D Wu, W Han, X Dong - arXiv preprint arXiv:2406.05039, 2024 - arxiv.org
Referring multi-object tracking (RMOT) aims at detecting and tracking multiple objects
following human instruction represented by a natural language expression. Existing RMOT …

Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities

Y Li, Y Li, Y Guo, B Ren, Z Xu, H Guo, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
In speaker tracking research, integrating and complementing multi-modal data is a crucial
strategy for improving the accuracy and robustness of tracking systems. However, tracking …

Multi-granularity Localization Transformer with Collaborative Understanding for Referring Multi-Object Tracking

J Chen, J Lin, G Zhong, Y Yao… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
As an essential component of Vision-Based Measurement (VBM), Referring Multi-Object
Tracking (RMOT) involves localizing and tracking specific objects in video frames using …