A review of deep learning for video captioning
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …
Survey on videos data augmentation for deep learning models
In most Computer Vision applications, Deep Learning models achieve state-of-the-art
performances. One drawback of Deep Learning is the large amount of data needed to train …
Movie recommendation system using sentiment analysis from microblogging data
Recommendation systems (RSs) have garnered immense interest for applications in
e-commerce and digital media. Traditional approaches in RSs include collaborative …
A feature-space multimodal data augmentation technique for text-video retrieval
Every hour, huge amounts of visual contents are posted on social media and user-
generated content platforms. To find relevant videos by means of a natural language query …
HKGCL: Hierarchical graph contrastive learning for multi-domain recommendation over knowledge graph
Multi-domain recommendation (MDR) aims to improve the recommendation performance in
all target domains simultaneously by leveraging rich data from relevant domains. However …
A hybrid approach based on GAN and CNN-LSTM for aerial activity recognition
A Bousmina, M Selmi, MA Ben Rhaiem, IR Farah - Remote Sensing, 2023 - mdpi.com
Unmanned aerial vehicles (UAVs), known as drones, have played a significant role in recent
years in creating resilient smart cities. UAVs can be used for a wide range of applications …
Renmin University of China and Zhejiang Gongshang University at TRECVID 2018: Deep Cross-Modal Embeddings for Video-Text Retrieval
In this paper we summarize our TRECVID 2018 [1] video retrieval experiments. We
participated in two tasks: Ad-hoc Video Search (AVS) and Video-to-Text (VTT) Matching and …
Let All be Whitened: Multi-teacher Distillation for Efficient Visual Retrieval
Visual retrieval aims to search for the most relevant visual items, e.g., images and videos,
from a candidate gallery with a given query item. Accuracy and efficiency are two competing …
Multi-trends enhanced dynamic micro-video recommendation
The explosively generated micro-videos on content sharing platforms call for recommender
systems to permit personalized micro-video discovery with ease. Recent advances in micro …
Feature re-learning with data augmentation for video relevance prediction
Predicting the relevance between two given videos with respect to their visual content is a
key component for content-based video recommendation and retrieval. Thanks to the …