An explainable and efficient deep learning framework for video anomaly detection
Deep learning-based video anomaly detection methods have drawn significant attention in
the past few years due to their superior performance. However, almost all the leading …
the past few years due to their superior performance. However, almost all the leading …
Video-based cross-modal auxiliary network for multimodal sentiment analysis
Multimodal sentiment analysis has a wide range of applications due to its information
complementarity in multimodal interactions. Previous works focus more on investigating …
complementarity in multimodal interactions. Previous works focus more on investigating …
Steps: Self-supervised key step extraction and localization from unlabeled procedural videos
A Shah, B Lundell, H Sawhney… - Proceedings of the …, 2023 - openaccess.thecvf.com
We address the problem of extracting key steps from unlabeled procedural videos,
motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training …
motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training …
Mhms: Multimodal hierarchical multimedia summarization
Multimedia summarization with multimodal output can play an essential role in real-world
applications, ie, automatically generating cover images and titles for news articles or …
applications, ie, automatically generating cover images and titles for news articles or …
Enhancing video anomaly detection using a transformer spatiotemporal attention unsupervised framework for large datasets
This work introduces an unsupervised framework for video anomaly detection, leveraging a
hybrid deep learning model that combines a vision transformer (ViT) with a convolutional …
hybrid deep learning model that combines a vision transformer (ViT) with a convolutional …
Unsupervised video summarization using deep Non-Local video summarization networks
Video summarization is to extract effective information from videos to quickly obtain the most
informative summary. Most of the existing video summarization methods use recurrent …
informative summary. Most of the existing video summarization methods use recurrent …
Exploring video frame redundancies for efficient data sampling and annotation in instance segmentation
In recent years, deep neural network architectures and learning algorithms have greatly
improved the performance of computer vision tasks. However, acquiring and annotating …
improved the performance of computer vision tasks. However, acquiring and annotating …
Adopting Self-Supervised Learning into Unsupervised Video Summarization through Restorative Score.
In this paper, we present a new process for creating video summaries in an unsupervised
manner. Our approach involves training a transformer encoder model to reconstruct missing …
manner. Our approach involves training a transformer encoder model to reconstruct missing …
Conditional deep clustering based transformed spatio-temporal features and fused distance for efficient video retrieval
Key frame extraction is essential for video retrieval because it reduces the quantity of data
needed to be processed. However, current video comparison methods classify videos by …
needed to be processed. However, current video comparison methods classify videos by …
SCCS: Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
Multimedia summarization with multimodal output (MSMO) is a recently explored application
in language grounding. It plays an essential role in real-world applications, ie, automatically …
in language grounding. It plays an essential role in real-world applications, ie, automatically …