VLTinT: Visual-linguistic transformer-in-transformer for coherent video paragraph captioning
Video Paragraph Captioning aims to generate a multi-sentence description of an
untrimmed video with multiple temporal event locations in a coherent storytelling. Following …
CLIP-TSA: CLIP-assisted temporal self-attention for weakly-supervised video anomaly detection
Video anomaly detection (VAD)–commonly formulated as a multiple-instance learning
problem in a weakly-supervised manner due to its labor-intensive nature–is a challenging …
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection
Temporal action detection (TAD) involves the localization and classification of action
instances within untrimmed videos. While standard TAD follows fully supervised learning …
SolarFormer: Multi-scale transformer for solar PV profiling
As climate change intensifies, the global imperative to shift towards sustainable energy
sources becomes more pronounced. Photovoltaic (PV) energy is a favored choice due to its …
S3Former: A Deep Learning Approach to High Resolution Solar PV Profiling
As the negative impact of climate change escalates, the global necessity to transition to
sustainable energy sources becomes increasingly evident. Renewable energies have …
S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling
As the impact of climate change escalates, the global necessity to transition to sustainable
energy sources becomes increasingly evident. Renewable energies have emerged as a …
Understanding Digital Literacy Using Film: Case Study of Netflix's The Social Dilemma
APP Putri - Jurnal JTIK (Jurnal Teknologi Informasi dan …, 2024 - journal.lembagakita.org
Social media may have been glorified as something helpful, easy and convenient but as we
all know it must have two sides of a coin. Social media does have quite a lot of bad sides …
Towards Multi-modal Explainable Video Understanding
K Yamazaki - 2023 - search.proquest.com
This thesis presents a novel approach to video understanding by emulating human
perceptual processes and creating an explainable and coherent storytelling representation …
Dense Video Captioning Based on Memory Enhanced Attention and Guided Learning
K Liang, X Cai, S Long, Y Huang - … International Conference on …, 2023 - ieeexplore.ieee.org
In the task of captioning multiple events in video content, traditional models using self-
attention mechanisms often suffer from the problem of missing fine-grained visual semantic …
Deep Learning for Photovoltaic Characterization
AM de Luis Garcia - 2023 - search.proquest.com
This thesis introduces a novel approach to Photovoltaic (PV) installation segmentation by
proposing a new architecture to understand and identify PV modules from overhead …