Zero-shot video grounding with pseudo query lookup and verification

Y Lu, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Video grounding, the process of identifying a specific moment in an untrimmed video based
on a natural language query, has become a popular topic in video understanding. However …

Relative-position embedding based spatially and temporally decoupled Transformer for action recognition

Y Ma, R Wang - Pattern Recognition, 2024 - Elsevier
Recognition of human actions is to classify actions in a video. Recently, Vision Transformer
(ViT) has been applied to action recognition. However, the Vision Transformer is unsuitable …

Multi-frame super-resolution of remote sensing images using attention-based GAN models

P Wang, E Sertel - Knowledge-Based Systems, 2023 - Elsevier
Multi-frame super-resolution (MFSR) of remote sensing (RS) imageries becomes a critical
research topic with the launch of new satellites having video capturing capability and the …

Deep multimodal representation learning for generalizable person re-identification

S Xiang, H Chen, W Ran, Z Yu, T Liu, D Qian, Y Fu - Machine Learning, 2024 - Springer
Person re-identification plays a significant role in realistic scenarios due to its various
applications in public security and video surveillance. Recently, leveraging the supervised …