Ego4D: Around the world in 3,000 hours of egocentric video

K Grauman, A Westbury, E Byrne… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …

Rescaling egocentric vision: Collection, pipeline and challenges for EPIC-KITCHENS-100

D Damen, H Doughty, GM Farinella, A Furnari… - International Journal of …, 2022 - Springer
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-
KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …

EPIC-Fusion: Audio-visual temporal binding for egocentric action recognition

E Kazakos, A Nagrani, A Zisserman… - Proceedings of the …, 2019 - openaccess.thecvf.com
We focus on multi-modal fusion for egocentric action recognition, and propose a novel
architecture for multi-modal temporal-binding, i.e. the combination of modalities within a …

The EPIC-KITCHENS dataset: Collection, challenges and baselines

D Damen, H Doughty, GM Farinella… - … on Pattern Analysis …, 2020 - ieeexplore.ieee.org
Since its introduction in 2018, EPIC-KITCHENS has attracted attention as the largest
egocentric video benchmark, offering a unique viewpoint on people's interaction with …

Scaling egocentric vision: The EPIC-KITCHENS dataset

D Damen, H Doughty, GM Farinella… - Proceedings of the …, 2018 - openaccess.thecvf.com
First-person vision is gaining interest as it offers a unique viewpoint on people's interaction
with objects, their attention, and even intention. However, progress in this challenging …

On semantic similarity in video retrieval

M Wray, H Doughty, D Damen - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Current video retrieval efforts all ground their evaluation on an instance-based assumption,
that only a single caption is relevant to a query video and vice versa. We demonstrate that …

Diagnosing error in temporal action detectors

H Alwassel, FC Heilbron, V Escorcia… - Proceedings of the …, 2018 - openaccess.thecvf.com
Despite the recent progress in video understanding and the continuous rate of improvement
in temporal action localization throughout the years, it is still unclear how far (or close?) we …

Egocentric vision-based action recognition: A survey

A Núñez-Marcos, G Azkune, I Arganda-Carreras - Neurocomputing, 2022 - Elsevier
The egocentric action recognition (EAR) field has recently increased its popularity due to the
affordable and lightweight wearable cameras available nowadays such as GoPro and …

BasicTAD: An astounding RGB-only baseline for temporal action detection

M Yang, G Chen, YD Zheng, T Lu, L Wang - Computer Vision and Image …, 2023 - Elsevier
Temporal action detection (TAD) is extensively studied in the video understanding
community by generally following the object detection pipeline in images. However, complex …

A generalized and robust framework for timestamp supervision in temporal action segmentation

R Rahaman, D Singhania, A Thiery, A Yao - European Conference on …, 2022 - Springer
In temporal action segmentation, Timestamp Supervision requires only a handful of labelled
frames per video sequence. For unlabelled frames, previous works rely on assigning hard …