VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

K Yamazaki, K Vo, QS Truong, B Raj… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Video Paragraph Captioning aims to generate a multi-sentence description of an
untrimmed video with multiple temporal event locations in a coherent storytelling. Following …

CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection

HK Joo, K Vo, K Yamazaki, N Le - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Video anomaly detection (VAD)–commonly formulated as a multiple-instance learning
problem in a weakly-supervised manner due to its labor-intensive nature–is a challenging …

ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection

T Phan, K Vo, D Le, G Doretto… - Proceedings of the …, 2024 - openaccess.thecvf.com
Temporal action detection (TAD) involves the localization and classification of action
instances within untrimmed videos. While standard TAD follows fully supervised learning …

SolarFormer: Multi-Scale Transformer for Solar PV Profiling

A De Luis, M Tran, T Hanyu, A Tran… - … on Smart Grid …, 2024 - ieeexplore.ieee.org
As climate change intensifies, the global imperative to shift towards sustainable energy
sources becomes more pronounced. Photovoltaic (PV) energy is a favored choice due to its …

S3Former: A Deep Learning Approach to High Resolution Solar PV Profiling

M Tran, A De Luis, H Liao, Y Huang… - … on Smart Grid, 2025 - ieeexplore.ieee.org
As the negative impact of climate change escalates, the global necessity to transition to
sustainable energy sources becomes increasingly evident. Renewable energies have …

S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

M Tran, A De Luis, H Liao, Y Huang, R McCann… - arXiv preprint arXiv …, 2024 - arxiv.org
As the impact of climate change escalates, the global necessity to transition to sustainable
energy sources becomes increasingly evident. Renewable energies have emerged as a …

Understanding Digital Literacy Using Film: Case Study of Netflix's The Social Dilemma

APP Putri - Jurnal JTIK (Jurnal Teknologi Informasi dan …, 2024 - journal.lembagakita.org
Social media may have been glorified as something helpful, easy and convenient but as we
all know it must have two sides of a coin. Social media does have quite a lot of bad sides …

Towards Multi-modal Explainable Video Understanding

K Yamazaki - 2023 - search.proquest.com
This thesis presents a novel approach to video understanding by emulating human
perceptual processes and creating an explainable and coherent storytelling representation …

Dense Video Captioning Based on Memory Enhanced Attention and Guided Learning

K Liang, X Cai, S Long, Y Huang - … International Conference on …, 2023 - ieeexplore.ieee.org
In the task of captioning multiple events in video content, traditional models using self-
attention mechanisms often suffer from the problem of missing fine-grained visual semantic …

Deep Learning for Photovoltaic Characterization

AM de Luis Garcia - 2023 - search.proquest.com
This thesis introduces a novel approach to Photovoltaic (PV) installation segmentation by
proposing a new architecture to understand and identify PV modules from overhead …