- Academic Search

Human action recognition from various data modalities: A review

Z Sun, Q Ke, H Rahmani, M Bennamoun… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Human Action Recognition (HAR) aims to understand human behavior and assign a label to
each action. It has a wide range of applications, and therefore has been attracting increasing …

Cited by 639

Video description: A survey of methods, datasets, and evaluation metrics

N Aafaq, A Mian, W Liu, SZ Gilani, M Shah - ACM Computing Surveys …, 2019 - dl.acm.org

Video description is the automatic generation of natural language sentences that describe
the contents of a given video. It has applications in human-robot interaction, helping the …

Cited by 256

Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives

K Grauman, A Westbury, L Torresani… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …

Cited by 131

VideoCLIP: Contrastive pre-training for zero-shot video-text understanding

H Xu, G Ghosh, PY Huang, D Okhonko…

Cited by 816

HowTo100M: Learning a text-video embedding by watching hundred million narrated video clips

A Miech, D Zhukov, JB Alayrac… - Proceedings of the …, 2019 - openaccess.thecvf.com

Learning text-video embeddings usually requires a dataset of video clips with manually
provided captions. However, such datasets are expensive and time consuming to create and …

Cited by 1311
