Holmes-VAD: Towards unbiased and explainable video anomaly detection via multi-modal LLM

H Zhang, X Xu, X Wang, J Zuo, C Han, X Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Towards open-ended Video Anomaly Detection (VAD), existing methods often exhibit
biased detection when faced with challenging or unseen events and lack interpretability. To …

GlanceVAD: Exploring glance supervision for label-efficient video anomaly detection

H Zhang, X Wang, X Xu, X Huang, C Han… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, video anomaly detection has been extensively investigated in both
unsupervised and weakly supervised settings to alleviate costly temporal labeling. Despite …

Holmes-VAU: Towards long-term video anomaly understanding at any granularity

H Zhang, X Xu, X Wang, J Zuo, X Huang, C Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
How can we enable models to comprehend video anomalies occurring over varying
temporal scales and contexts? Traditional Video Anomaly Understanding (VAU) methods …

Stepwise Multi-grained Boundary Detector for Point-Supervised Temporal Action Localization

M Liu, L Wang, S Zhou, K Xia, Q Wu, Q Zhang… - … on Computer Vision, 2024 - Springer
Point-supervised temporal action localization pursues high-accuracy action detection under
low-cost data annotation. Despite recent advances, a significant challenge remains: sparse …

Click-level supervision for online action detection extended from SCOAD

X Zhang, Y Mei, Y Na, XL Lin, G Bian, Q Yan… - Future Generation …, 2025 - Elsevier
Data-driven fully-supervised online action detection algorithms heavily rely on manual
annotations, which are challenging to obtain in real-world applications. Current research …

Semi‐supervised pipe video temporal defect interval localization

Z Huang, G Pan, C Kang, YZ Lv - Computer‐Aided Civil and …, 2024 - Wiley Online Library
In sewer pipe closed‐circuit television inspection, accurate temporal defect localization is
essential for effective pipe assessment. Industry standards typically do not require time …

Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models

C Han, H Wang, J Kuang, L Zhang, J Gui - arXiv preprint arXiv:2501.13795, 2025 - arxiv.org
Existing zero-shot temporal action detection (ZSTAD) methods predominantly use fully
supervised or unsupervised strategies to recognize unseen activities. However, these …

SQL-Net: Semantic Query Learning for Point-Supervised Temporal Action Localization

Y Wang, S Zhao, S Chen - IEEE Transactions on Multimedia, 2024 - ieeexplore.ieee.org
Point-supervised Temporal Action Localization (PS-TAL) detects temporal intervals of
actions in untrimmed videos with a label-efficient paradigm. However, most existing methods …

MKP-Net: Memory knowledge propagation network for point-supervised temporal action localization in livestreaming

L Chen, J Zhang, Y Zhang, J Kang, L Zhuo - Computer Vision and Image …, 2024 - Elsevier
Standardized regulation of livestreaming is an important element of cyberspace governance.
Temporal action localization (TAL) can localize the occurrence of specific actions to better …

Snippet-inter Difference Attention Network for Weakly-supervised Temporal Action Localization

W Zhou, K Lin, W Hu, C Xie, T Su… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
The purpose of the weakly-supervised temporal action localization (WTAL) task is to
simultaneously classify and localize action instances in untrimmed videos with only video …