Google Scholar

Cited by 71 · All 8 versions

Temporal action segmentation: An analysis of modern techniques

G Ding, F Sener, A Yao - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org

Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …

Cited by 182 · All 3 versions

MovieChat: From dense token to sparse memory for long video understanding

E Song, W Chai, G Wang, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recently, integrating video foundation models and large language models to build a video
understanding system can overcome the limitations of specific pre-defined vision tasks. Yet …

Cited by 211 · All 8 versions

Assembly101: A large-scale multi-view video dataset for understanding procedural activities

F Sener, D Chatterjee, D Shelepov… - Proceedings of the …, 2022 - openaccess.thecvf.com

Assembly101 is a new procedural activity dataset featuring 4321 videos of people
assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed …

Cited by 369 · All 6 versions

NExT-QA: Next phase of question-answering to explaining temporal actions

J Xiao, X Shang, A Yao… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We introduce NExT-QA, a rigorously designed video question answering (VideoQA)
benchmark to advance video understanding from describing to explaining the temporal …

Cited by 251 · All 6 versions

Anticipative video transformer

R Girdhar, K Grauman - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

We propose Anticipative Video Transformer (AVT), an end-to-end attention-based
video modeling architecture that attends to the previously observed video in order to …

Cited by 83 · All 5 versions

Diffusion action segmentation

D Liu, Q Li, AD Dinh, T Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Temporal action segmentation is crucial for understanding long-form videos. Previous works
on this task commonly adopt an iterative refinement paradigm by using multi-stage models …

Cited by 84 · All 2 versions

VideoLLM: Modeling video sequence with large language models

G Chen, YD Zheng, J Wang, J Xu, Y Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

With the exponential growth of video data, there is an urgent need for automated technology
to analyze and comprehend video content. However, existing video understanding models …

Cited by 244 · All 2 versions

A comprehensive study of deep video action recognition

Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu… - arXiv preprint arXiv …, 2020 - arxiv.org

Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …