Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models

X Xu, T Niu, Y Xie, L Qin, W Che, MY Kan - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) excel in vision-language tasks by pre-training solely on coarse-grained concept annotations (e.g., image captions). We hypothesize …

ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model

Q Chen, L Qin, J Liu, D Peng, J Wang, M Hu… - arXiv preprint arXiv …, 2025 - arxiv.org
Recent advancements in large language models (LLMs) have led to significant successes across various applications, among which the most notable is a series of emerging …

Analyzing Fine-tuning Representation Shift for Multimodal LLMs Steering Alignment

P Khayatan, M Shukor, J Parekh, M Cord - arXiv preprint arXiv:2501.03012, 2025 - arxiv.org
Multimodal LLMs have reached remarkable levels of proficiency in understanding
multimodal inputs, driving extensive research to develop increasingly powerful models …