Exploring the potential of large language models for improving digital forensic investigation efficiency
The ever-increasing workload of digital forensic labs raises concerns about law
enforcement's ability to conduct both cyber-related and non-cyber-related investigations …
Thinking in space: How multimodal large language models see, remember, and recall spaces
Humans possess the visual-spatial intelligence to remember spaces from sequential visual
observations. However, can Multimodal Large Language Models (MLLMs) trained on million …
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
In the video-language domain, recent works in leveraging zero-shot Large Language Model-
based reasoning for video understanding have become competitive challengers to previous …
AriGraph: Learning knowledge graph world models with episodic memory for LLM agents
Advancements in generative AI have broadened the potential applications of Large
Language Models (LLMs) in the development of autonomous agents. Achieving true …
ING-VP: MLLMs cannot play easy vision-based games yet
As multimodal large language models (MLLMs) continue to demonstrate increasingly
competitive performance across a broad spectrum of tasks, more intricate and …
Sparkle: Mastering basic spatial capabilities in vision language models elicits generalization to composite spatial reasoning
Vision language models (VLMs) have demonstrated impressive performance across a wide
range of downstream tasks. However, their proficiency in spatial reasoning remains limited …
Evaluating the ability of large language models to reason about Cardinal directions
We investigate the abilities of a representative set of Large Language Models (LLMs) to
reason about cardinal directions (CDs). To do so, we create two datasets: the first, co …
From play to understanding: Large language models in logic and spatial reasoning coloring activities for children
S Tapia-Mandiola, R Araya - AI, 2024 - mdpi.com
Visual thinking leverages spatial mechanisms in animals for navigation and reasoning.
Therefore, given the challenge of abstract mathematics and logic, spatial reasoning-based …
Visual Perception in Text Strings
Understanding visual semantics embedded in consecutive characters is a crucial capability
for both large language models (LLMs) and multi-modal large language models (MLLMs) …
The Visualization JUDGE: Can Multimodal Foundation Models Guide Visualization Design Through Visual Perception?
M Berger, S Liu - 2024 IEEE Evaluation and Beyond …, 2024 - ieeexplore.ieee.org
Foundation models for vision and language are the basis of AI applications across
numerous sectors of society. The success of these models stems from their ability to mimic …