Zero-shot video grounding with pseudo query lookup and verification

Y Lu, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Video grounding, the process of identifying a specific moment in an untrimmed video based
on a natural language query, has become a popular topic in video understanding. However …

Relative-position embedding based spatially and temporally decoupled Transformer for action recognition

Y Ma, R Wang - Pattern Recognition, 2024 - Elsevier
Recognition of human actions is to classify actions in a video. Recently, Vision Transformer
(ViT) has been applied to action recognition. However, the Vision Transformer is unsuitable …

Multi-frame super-resolution of remote sensing images using attention-based GAN models

P Wang, E Sertel - Knowledge-Based Systems, 2023 - Elsevier
Multi-frame super-resolution (MFSR) of remote sensing (RS) imageries becomes a critical
research topic with the launch of new satellites having video capturing capability and the …

Action sensitivity learning for temporal action localization

J Shao, X Wang, R Quan, J Zheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal action localization (TAL), which involves recognizing and locating action
instances, is a challenging task in video understanding. Most existing approaches directly …