A survey of video-based human action recognition in team sports

H Yin, RO Sinnott, GT Jayaputera - Artificial Intelligence Review, 2024 - Springer
Over the past few decades, numerous studies have focused on identifying and recognizing
human actions using machine learning and computer vision techniques. Video-based …

Medical video generation for disease progression simulation

X Cao, K Liang, KD Liao, T Gao, W Ye, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Modeling disease progression is crucial for improving the quality and efficacy of clinical
diagnosis and prognosis, but it is often hindered by a lack of longitudinal medical image …

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation

M Zheng, Y Xu, H Huang, X Ma, Y Liu, W Shu… - arXiv preprint arXiv …, 2024 - arxiv.org
Current video generation models excel at generating short clips but still struggle with
creating multi-shot, movie-like videos. Existing models trained on large-scale data on the …

Video-to-Audio Generation with Fine-grained Temporal Semantics

Y Hu, Y Gu, C Li, R Chen, D Yu - arXiv preprint arXiv:2409.14709, 2024 - arxiv.org
With recent advances of AIGC, video generation has gained a surge of research interest in
both academia and industry (e.g., Sora). However, it remains a challenge to produce …

A Survey of Sustainability in Large Language Models: Applications, Economics, and Challenges

A Singh, NP Patel, A Ehtesham, S Kumar… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have transformed numerous domains by providing
advanced capabilities in natural language understanding, generation, and reasoning …

Memory-enhanced hierarchical transformer for video paragraph captioning

B Zhang, J Gao, Y Yuan - Neurocomputing, 2025 - Elsevier
Video paragraph captioning aims to describe a video that contains multiple events with a
paragraph of generated coherent sentences. Such a captioning task is full of challenges …

Dual Variational Knowledge Attention for Class Incremental Vision Transformer

H Duan, R Sun, V Ojha, T Shah… - … Joint Conference on …, 2024 - ieeexplore.ieee.org
Class incremental learning (CIL) strives to emulate the human cognitive process of
continuously learning and adapting to new tasks while retaining knowledge from past …

Adversarial Attacks Against Shared Knowledge Interpretation in Semantic Communications

VT Hoang, VL Nguyen, RG Chang… - IEEE Transactions …, 2025 - ieeexplore.ieee.org
Semantic communications (SEMCOM) is a novel communication model that exploits neural
networks or deep learning techniques to convey the semantics of the data and contextual …

Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop

Z Qian, A Sharifi, T Carroll, SN Lim - arXiv preprint arXiv:2411.18644, 2024 - arxiv.org
Video generation has achieved impressive quality, but it still suffers from artifacts such as
temporal inconsistency and violation of physical laws. Leveraging 3D scenes can …

Seeing World Dynamics in a Nutshell

Q Shen, X Yi, M Lin, H Zhang, S Yan… - arXiv preprint arXiv …, 2025 - arxiv.org
We consider the problem of efficiently representing casually captured monocular videos in a
spatially- and temporally-coherent manner. While existing approaches predominantly rely on …