Towards global video scene segmentation with context-aware transformer

Y Yang, Y Huang, W Guo, B Xu, D Xia - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Videos such as movies or TV episodes usually need to divide the long storyline into
cohesive units, i.e., scenes, to facilitate the understanding of video semantics. The key …

Multimodal high-order relation transformer for scene boundary detection

X Wei, Z Shi, T Zhang, X Yu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Scene boundary detection breaks down long videos into meaningful story-telling units and
plays a crucial role in high-level video understanding. Despite significant advancements in …

NewsNet: A novel dataset for hierarchical temporal segmentation

H Wu, K Chen, H Liu, M Zhuge, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal video segmentation is the get-to-go automatic video analysis, which decomposes
a long-form video into smaller components for the following-up understanding tasks. Recent …

Collaborative noisy label cleaner: Learning scene-aware trailers for multi-modal highlight detection in movies

B Gan, X Shu, R Qiao, H Wu, K Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Movie highlights stand out of the screenplay for efficient browsing and play a crucial role on
social media platforms. Based on existing efforts, this work has two observations: (1) For …

Characters link shots: character attention network for movie scene segmentation

J Tan, H Wang, J Yuan - ACM Transactions on Multimedia Computing …, 2023 - dl.acm.org
Movie scene segmentation aims to automatically segment a movie into multiple story units,
i.e., scenes, each of which is a series of semantically coherent and time-continual shots …

Video Scene Detection Using Transformer Encoding Linker Network (TELNet)

SM Tseng, ZT Yeh, CY Wu, JB Chang, M Norouzi - Sensors, 2023 - mdpi.com
This paper introduces a transformer encoding linker network (TELNet) for automatically
identifying scene boundaries in videos without prior knowledge of their structure. Videos …

Movies2Scenes: Using movie metadata to learn scene representation

S Chen, CH Liu, X Hao, X Nie… - Proceedings of the …, 2023 - openaccess.thecvf.com
Understanding scenes in movies is crucial for a variety of applications such as video
moderation, search, and recommendation. However, labeling individual scenes is a time …

Neighbor Relations Matter in Video Scene Detection

J Tan, H Wang, J Li, Z Ou… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Video scene detection aims to temporally link shots for obtaining semantically compact
scenes. It is essential for this task to capture scene-distinguishable affinity among shots by …

SPICA: Interactive Video Content Exploration through Augmented Audio Descriptions for Blind or Low-Vision Viewers

Z Ning, BL Wimer, K Jiang, K Chen, J Ban… - Proceedings of the CHI …, 2024 - dl.acm.org
Blind or Low-Vision (BLV) users often rely on audio descriptions (AD) to access video
content. However, conventional static ADs can leave out detailed information in videos …

MEGA: Multimodal alignment aggregation and distillation for cinematic video segmentation

N Sadoughi, X Li, A Vajpayee, D Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Previous research has studied the task of segmenting cinematic videos into scenes and into
narrative acts. However, these studies have overlooked the essential task of multimodal …