VidChapters-7M: Video chapters at scale

A Yang, A Nagrani, I Laptev, J Sivic… - Advances in Neural …, 2023 - proceedings.neurips.cc
Segmenting untrimmed videos into chapters enables users to quickly navigate to the
information of their interest. This important topic has been understudied due to the lack of …

Efficient movie scene detection using state-space transformers

MM Islam, M Hasan, KS Athrey… - Proceedings of the …, 2023 - openaccess.thecvf.com
The ability to distinguish between different movie scenes is critical for understanding the
storyline of a movie. However, accurately detecting movie scenes is often challenging as it …

Towards global video scene segmentation with context-aware transformer

Y Yang, Y Huang, W Guo, B Xu, D Xia - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Videos such as movies or TV episodes usually need to divide the long storyline into
cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key …

How You Feelin'? Learning Emotions and Mental States in Movie Scenes

D Srivastava, AK Singh… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Movie story analysis requires understanding characters' emotions and mental states.
Towards this goal, we formulate emotion understanding as predicting a diverse and multi …

UBoCo: Unsupervised boundary contrastive learning for generic event boundary detection

H Kang, J Kim, T Kim, SJ Kim - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Generic Event Boundary Detection (GEBD) is a newly suggested video
understanding task that aims to find one level deeper semantic boundaries of events …

NewsNet: A novel dataset for hierarchical temporal segmentation

H Wu, K Chen, H Liu, M Zhuge, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal video segmentation is the get-to-go automatic video analysis, which decomposes
a long-form video into smaller components for the following-up understanding tasks. Recent …

VideoLLaMB: Long-context video understanding with recurrent memory bridges

Y Wang, C Xie, Y Liu, Z Zheng - arXiv preprint arXiv:2409.01071, 2024 - arxiv.org
Recent advancements in large-scale video-language models have shown significant
potential for real-time planning and detailed interactions. However, their high computational …

Scene consistency representation learning for video scene segmentation

H Wu, K Chen, Y Luo, R Qiao, B Ren… - Proceedings of the …, 2022 - openaccess.thecvf.com
A long-term video, such as a movie or TV show, is composed of various scenes, each of
which represents a series of shots sharing the same semantic story. Spotting the correct …

Characters link shots: Character attention network for movie scene segmentation

J Tan, H Wang, J Yuan - ACM Transactions on Multimedia Computing …, 2023 - dl.acm.org
Movie scene segmentation aims to automatically segment a movie into multiple story units,
i.e., scenes, each of which is a series of semantically coherent and time-continual shots …

Self-supervised pretraining for stereoscopic image super-resolution with parallax-aware masking

Z Zhang, J Lei, B Peng, J Zhu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Most existing learning-based methods for stereoscopic image super-resolution rely on a
great number of high-resolution stereoscopic images as labels. To alleviate the problem of …