Explainable and interpretable multimodal large language models: A comprehensive survey
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with
large language models (LLMs) and computer vision (CV) systems driving advancements in …
VidHal: Benchmarking Temporal Hallucinations in Vision LLMs
Vision Large Language Models (VLLMs) are widely acknowledged to be prone to
hallucination. Existing research addressing this problem has primarily been confined to …
OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions
The lack of occlusion data in commonly used action recognition video datasets limits model
robustness and impedes sustained performance improvements. We construct OccludeNet, a …