A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …
Deep learning-based action detection in untrimmed videos: A survey
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …
VideoMAE V2: Scaling video masked autoencoders with dual masking
Scale is the primary factor for building a powerful foundation model that could well
generalize to a variety of downstream tasks. However, it is still challenging to train video …
TriDet: Temporal action detection with relative boundary modeling
In this paper, we present a one-stage framework TriDet for temporal action detection.
Existing methods often suffer from imprecise boundary predictions due to the ambiguous …
Ego4D: Around the world in 3,000 hours of egocentric video
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …
ActionFormer: Localizing moments of actions with transformers
Self-attention based Transformer models have demonstrated impressive results for image
classification and object detection, and more recently for video understanding. Inspired by …
Prompting visual-language models for efficient video understanding
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …
Learning salient boundary feature for anchor-free temporal action localization
Temporal action localization is an important yet challenging task in video understanding.
Typically, such a task aims at inferring both the action category and localization of the start …
End-to-end temporal action detection with transformer
Temporal action detection (TAD) aims to determine the semantic label and the temporal
interval of every action instance in an untrimmed video. It is a fundamental and challenging …
Self-supervised learning by cross-modal audio-video clustering
Visual and audio modalities are highly correlated, yet they contain different information.
Their strong correlation makes it possible to predict the semantics of one from the other with …