A review of deep learning for video captioning

M Abdar, M Kollati, S Kuraparthi… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …

Video captioning with aggregated features based on dual graphs and gated fusion

Y **, B Liu, J Wang - arXiv preprint arXiv:2308.06685, 2023 - arxiv.org
The application of video captioning models aims at translating the content of videos by using
accurate natural language. Due to the complex nature in between object interaction in the …

A Study of Multimodal Colearning, Application in Biometrics and Authentication

S Avasthi, T Sanwal, A Prakash… - … Biometric and Machine …, 2023 - Wiley Online Library
Summary: “Multimodality” refers to utilizing multiple communication methods to comprehend
our environment better and enhance the user's experience. Using multimodal data, we may …