Explainable and interpretable multimodal large language models: A comprehensive survey
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with
large language models (LLMs) and computer vision (CV) systems driving advancements in …
Towards universality: Studying mechanistic similarity across language model architectures
The hypothesis of Universality in interpretability suggests that different neural networks may
converge to implement similar algorithms on similar tasks. In this work, we investigate two …
Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks
A key objective of interpretability research on large language models (LLMs) is to develop
methods for robustly steering models toward desired behaviors. To this end, two distinct …
LLMs can see and hear without any training
We present MILS: Multimodal Iterative LLM Solver, a surprisingly simple, training-free
approach, to imbue multimodal capabilities into your favorite LLM. Leveraging their innate …
GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
É Zablocki, V Gerard, A Cardiel, E Gaussier… - arxiv preprint arxiv …, 2024 - arxiv.org
Understanding deep models is crucial for deploying them in safety-critical applications. We
introduce GIFT, a framework for deriving post-hoc, global, interpretable, and faithful textual …