Exploring the potential of large language models for improving digital forensic investigation efficiency

A Wickramasekara, F Breitinger, M Scanlon - Forensic Science International …, 2025 - Elsevier
The ever-increasing workload of digital forensic labs raises concerns about law
enforcement's ability to conduct both cyber-related and non-cyber-related investigations …

Thinking in space: How multimodal large language models see, remember, and recall spaces

J Yang, S Yang, AW Gupta, R Han, L Fei-Fei… - arXiv preprint arXiv …, 2024 - arxiv.org
Humans possess the visual-spatial intelligence to remember spaces from sequential visual
observations. However, can Multimodal Large Language Models (MLLMs) trained on million …

VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs

R Liao, M Erler, H Wang, G Zhai, G Zhang, Y Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
In the video-language domain, recent works in leveraging zero-shot Large Language Model-
based reasoning for video understanding have become competitive challengers to previous …

AriGraph: Learning knowledge graph world models with episodic memory for LLM agents

P Anokhin, N Semenov, A Sorokin, D Evseev… - arXiv preprint arXiv …, 2024 - arxiv.org
Advancements in generative AI have broadened the potential applications of Large
Language Models (LLMs) in the development of autonomous agents. Achieving true …

ING-VP: MLLMs cannot play easy vision-based games yet

H Zhang, H Guo, S Guo, M Cao, W Huang, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
As multimodal large language models (MLLMs) continue to demonstrate increasingly
competitive performance across a broad spectrum of tasks, more intricate and …

Sparkle: Mastering basic spatial capabilities in vision language models elicits generalization to composite spatial reasoning

Y Tang, A Qu, Z Wang, D Zhuang, Z Wu, W Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision language models (VLMs) have demonstrated impressive performance across a wide
range of downstream tasks. However, their proficiency in spatial reasoning remains limited …

Evaluating the ability of large language models to reason about cardinal directions

AG Cohn, RE Blackwell - arXiv preprint arXiv:2406.16528, 2024 - arxiv.org
We investigate the abilities of a representative set of Large Language Models (LLMs) to
reason about cardinal directions (CDs). To do so, we create two datasets: the first, co …

From play to understanding: Large language models in logic and spatial reasoning coloring activities for children

S Tapia-Mandiola, R Araya - AI, 2024 - mdpi.com
Visual thinking leverages spatial mechanisms in animals for navigation and reasoning.
Therefore, given the challenge of abstract mathematics and logic, spatial reasoning-based …

Visual Perception in Text Strings

Q Jia, X Yue, S Huang, Z Qin, Y Liu, BY Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding visual semantics embedded in consecutive characters is a crucial capability
for both large language models (LLMs) and multi-modal large language models (MLLMs) …

The Visualization JUDGE: Can Multimodal Foundation Models Guide Visualization Design Through Visual Perception?

M Berger, S Liu - 2024 IEEE Evaluation and Beyond …, 2024 - ieeexplore.ieee.org
Foundation models for vision and language are the basis of AI applications across
numerous sectors of society. The success of these models stems from their ability to mimic …