An explainable and efficient deep learning framework for video anomaly detection

C Wu, S Shao, C Tunc, P Satam, S Hariri - Cluster Computing, 2022 - Springer
Deep learning-based video anomaly detection methods have drawn significant attention in
the past few years due to their superior performance. However, almost all the leading …

Video-based cross-modal auxiliary network for multimodal sentiment analysis

R Chen, W Zhou, Y Li, H Zhou - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org
Multimodal sentiment analysis has a wide range of applications due to its information
complementarity in multimodal interactions. Previous works focus more on investigating …

Steps: Self-supervised key step extraction and localization from unlabeled procedural videos

A Shah, B Lundell, H Sawhney… - Proceedings of the …, 2023 - openaccess.thecvf.com
We address the problem of extracting key steps from unlabeled procedural videos,
motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training …

MHMS: Multimodal hierarchical multimedia summarization

J Qiu, J Zhu, M Xu, F Dernoncourt, T Bui… - arXiv preprint arXiv …, 2022 - arxiv.org
Multimedia summarization with multimodal output can play an essential role in real-world
applications, i.e., automatically generating cover images and titles for news articles or …

Enhancing video anomaly detection using a transformer spatiotemporal attention unsupervised framework for large datasets

MH Habeb, M Salama, LA Elrefaei - Algorithms, 2024 - mdpi.com
This work introduces an unsupervised framework for video anomaly detection, leveraging a
hybrid deep learning model that combines a vision transformer (ViT) with a convolutional …

Unsupervised video summarization using deep Non-Local video summarization networks

SS Zang, H Yu, Y Song, R Zeng - Neurocomputing, 2023 - Elsevier
Video summarization is to extract effective information from videos to quickly obtain the most
informative summary. Most of the existing video summarization methods use recurrent …

Exploring video frame redundancies for efficient data sampling and annotation in instance segmentation

J Yoon, MK Choi - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
In recent years, deep neural network architectures and learning algorithms have greatly
improved the performance of computer vision tasks. However, acquiring and annotating …

Adopting Self-Supervised Learning into Unsupervised Video Summarization through Restorative Score

M Abbasi, P Saeedi - 2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
In this paper, we present a new process for creating video summaries in an unsupervised
manner. Our approach involves training a transformer encoder model to reconstruct missing …

Conditional deep clustering based transformed spatio-temporal features and fused distance for efficient video retrieval

A Banerjee, E Kumar, M Ravinder - International Journal of Information …, 2023 - Springer
Key frame extraction is essential for video retrieval because it reduces the quantity of data
needed to be processed. However, current video comparison methods classify videos by …

SCCS: Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment

J Qiu, J Zhu, M Xu, F Dernoncourt, T Bui… - Findings of the …, 2023 - aclanthology.org
Multimedia summarization with multimodal output (MSMO) is a recently explored application
in language grounding. It plays an essential role in real-world applications, i.e., automatically …