A review of deep learning for video captioning

M Abdar, M Kollati, S Kuraparthi… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …

Survey on videos data augmentation for deep learning models

N Cauli, D Reforgiato Recupero - Future Internet, 2022 - mdpi.com
In most Computer Vision applications, Deep Learning models achieve state-of-the-art
performances. One drawback of Deep Learning is the large amount of data needed to train …

Movie recommendation system using sentiment analysis from microblogging data

S Kumar, K De, PP Roy - IEEE Transactions on Computational …, 2020 - ieeexplore.ieee.org
Recommendation systems (RSs) have garnered immense interest for applications in
e-commerce and digital media. Traditional approaches in RSs include such as collaborative …

A feature-space multimodal data augmentation technique for text-video retrieval

A Falcon, G Serra, O Lanz - Proceedings of the 30th ACM International …, 2022 - dl.acm.org
Every hour, huge amounts of visual contents are posted on social media and
user-generated content platforms. To find relevant videos by means of a natural language query …

HKGCL: Hierarchical graph contrastive learning for multi-domain recommendation over knowledge graph

Y Li, L Hou, D Li, J Li - Expert Systems with Applications, 2023 - Elsevier
Multi-domain recommendation (MDR) aims to improve the recommendation performance in
all target domains simultaneously by leveraging rich data from relevant domains. However …

A hybrid approach based on GAN and CNN-LSTM for aerial activity recognition

A Bousmina, M Selmi, MA Ben Rhaiem, IR Farah - Remote Sensing, 2023 - mdpi.com
Unmanned aerial vehicles (UAVs), known as drones, have played a significant role in recent
years in creating resilient smart cities. UAVs can be used for a wide range of applications …

Renmin University of China and Zhejiang Gongshang University at TRECVID 2018: Deep Cross-Modal Embeddings for Video-Text Retrieval

X Li, J Dong, C Xu, J Cao, X Wang, G Yang - TRECVID, 2018 - lixirong.net
In this paper we summarize our TRECVID 2018 [1] video retrieval experiments. We
participated in two tasks: Ad-hoc Video Search (AVS) and Video-to-Text (VTT) Matching and …

Let All be Whitened: Multi-teacher Distillation for Efficient Visual Retrieval

Z Ma, J Dong, S Ji, Z Liu, X Zhang, Z Wang… - Proceedings of the …, 2024 - ojs.aaai.org
Visual retrieval aims to search for the most relevant visual items, e.g., images and videos,
from a candidate gallery with a given query item. Accuracy and efficiency are two competing …

Multi-trends enhanced dynamic micro-video recommendation

Y Lu, Y Huang, S Zhang, W Han, H Chen… - … Conference on Artificial …, 2023 - Springer
The explosively generated micro-videos on content sharing platforms call for recommender
systems to permit personalized micro-video discovery with ease. Recent advances in micro …

Feature re-learning with data augmentation for video relevance prediction

J Dong, X Wang, L Zhang, C Xu… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Predicting the relevance between two given videos with respect to their visual content is a
key component for content-based video recommendation and retrieval. Thanks to the …