- Academic Search

Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org

Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Cited by 75

STAR-Transformer: a spatio-temporal cross attention transformer for human action recognition

D Ahn, S Kim, H Hong, BC Ko - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

In action recognition, although the combination of spatio-temporal videos and skeleton
features can improve the recognition performance, a separate model and balancing feature …

Cited by 161

VideoLLM: Modeling video sequence with large language models

G Chen, YD Zheng, J Wang, J Xu, Y Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

With the exponential growth of video data, there is an urgent need for automated technology
to analyze and comprehend video content. However, existing video understanding models …

Cited by 84

Video transformers: A survey

J Selva, AS Johansen, S Escalera… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …

Cited by 137

Hybrid relation guided set matching for few-shot action recognition

X Wang, S Zhang, Z Qing, M Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Current few-shot action recognition methods reach impressive performance by learning
discriminative features for each video via episodic training and designing various temporal …

Cited by 108

MoLo: Motion-augmented long-short contrastive learning for few-shot action recognition

X Wang, S Zhang, Z Qing, C Gao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Current state-of-the-art approaches for few-shot action recognition achieve promising
performance by conducting frame-level matching on learned visual features. However, they …

Cited by 69

Flow-guided transformer for video inpainting

K Zhang, J Fu, D Liu - European Conference on Computer Vision, 2022 - Springer

We propose a flow-guided transformer, which innovatively leverages the motion discrepancy
exposed by optical flows to instruct the attention retrieval in transformer for high fidelity video …

Cited by 74

Real-time online video detection with temporal smoothing transformers

Y Zhao, P Krähenbühl - European Conference on Computer Vision, 2022 - Springer

Streaming video recognition reasons about objects and their actions in every frame of a
video. A good streaming recognition model captures both long-term dynamics and short …

Cited by 62