Temporal action segmentation: An analysis of modern techniques
Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …
Deep learning-based action detection in untrimmed videos: A survey
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …
Assembly101: A large-scale multi-view video dataset for understanding procedural activities
F Sener, D Chatterjee, D Shelepov… - Proceedings of the …, 2022 - openaccess.thecvf.com
Assembly101 is a new procedural activity dataset featuring 4321 videos of people
assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed …
HowTo100M: Learning a text-video embedding by watching hundred million narrated video clips
Learning text-video embeddings usually requires a dataset of video clips with manually
provided captions. However, such datasets are expensive and time consuming to create and …
Temporal cycle-consistency learning
We introduce a self-supervised representation learning method based on the task of
temporal alignment between videos. The method trains a network using temporal cycle …
Collaborative learning of semi-supervised segmentation and classification for medical images
Medical image analysis has two important research areas: disease grading and fine-grained
lesion segmentation. Although the former problem often relies on the latter, the two are …
Cross-task weakly supervised learning from instructional videos
In this paper we investigate learning visual models for the steps of ordinary tasks using weak
supervision via instructional narrations and an ordered list of steps instead of strong …
Temporal aggregate representations for long-range video understanding
Future prediction, especially in long-range videos, requires reasoning from current and past
observations. In this work, we address questions of temporal extent, scaling, and level of …
Improving action segmentation via graph-based temporal reasoning
Temporal relations among multiple action segments play an important role in action
segmentation especially when observations are limited (e.g., actions are occluded by other …
TL;DW? Summarizing instructional videos with task relevance and cross-modal saliency
YouTube users looking for instructions for a specific task may spend a long time browsing
content trying to find the right video that matches their needs. Creating a visual summary …