- Academic Search

Zero-shot video grounding with pseudo query lookup and verification

Y Lu, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org

Video grounding, the process of identifying a specific moment in an untrimmed video based
on a natural language query, has become a popular topic in video understanding. However …

Cited by 12, 5 versions

Relative-position embedding based spatially and temporally decoupled Transformer for action recognition

Y Ma, R Wang - Pattern Recognition, 2024 - Elsevier

Recognition of human actions is to classify actions in a video. Recently, Vision Transformer
(ViT) has been applied to action recognition. However, the Vision Transformer is unsuitable …

Cited by 26, 3 versions

Multi-frame super-resolution of remote sensing images using attention-based GAN models

P Wang, E Sertel - Knowledge-Based Systems, 2023 - Elsevier

Multi-frame super-resolution (MFSR) of remote sensing (RS) imageries becomes a critical
research topic with the launch of new satellites having video capturing capability and the …

Cited by 21, 2 versions

Action sensitivity learning for temporal action localization

J Shao, X Wang, R Quan, J Zheng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Temporal action localization (TAL), which involves recognizing and locating action
instances, is a challenging task in video understanding. Most existing approaches directly …

Cited by 26, 6 versions

Saved to My library

Align and tell: Boosting text-video retrieval with local alignment and fine-grained supervision

Zero-shot video grounding with pseudo query lookup and verification

Relative-position embedding based spatially and temporally decoupled Transformer for action recognition

Multi-frame super-resolution of remote sensing images using attention-based GAN models

Action sensitivity learning for temporal action localization