Knowledge mechanisms in large language models: A survey and perspective

M Wang, Y Yao, Z Xu, S Qiao, S Deng, P Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for
advancing towards trustworthy AGI. This paper reviews knowledge mechanism analysis …

How do large language models handle multilingualism?

Y Zhao, W Zhang, G Chen… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large language models (LLMs) have demonstrated impressive capabilities across diverse
languages. This study explores how LLMs handle multilingualism. Based on observed …

Explainable and interpretable multimodal large language models: A comprehensive survey

Y Dang, K Huang, J Huo, Y Yan, S Huang, D Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with
large language models (LLMs) and computer vision (CV) systems driving advancements in …

Talking heads: Understanding inter-layer communication in transformer language models

J Merullo, C Eickhoff, E Pavlick - Advances in Neural …, 2025 - proceedings.neurips.cc
Although it is known that transformer language models (LMs) pass features from early layers
to later layers, it is not well understood how this information is represented and routed by the …

MMNeuron: Discovering neuron-level domain-specific interpretation in multimodal large language model

J Huo, Y Yan, B Hu, Y Yue, X Hu - arXiv preprint arXiv:2406.11193, 2024 - arxiv.org
Projecting visual features into word embedding space has become a significant fusion
strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal …

Configurable foundation models: Building LLMs from a modular perspective

C Xiao, Z Zhang, C Song, D Jiang, F Yao, X Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Advancements in LLMs have recently unveiled challenges tied to computational efficiency
and continual scalability due to their requirements of huge parameters, making the …

MINER: Mining the underlying pattern of modality-specific neurons in multimodal large language models

K Huang, J Huo, Y Yan, K Wang, Y Yue… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, multimodal large language models (MLLMs) have significantly advanced,
integrating more modalities into diverse applications. However, the lack of explainability …

Knowledge localization: Mission not accomplished? Enter query localization!

Y Chen, P Cao, Y Chen, K Liu, J Zhao - arXiv preprint arXiv:2405.14117, 2024 - arxiv.org
Large language models (LLMs) store extensive factual knowledge, but the mechanisms
behind how they store and express this knowledge remain unclear. The Knowledge Neuron …

Llama Scope: Extracting millions of features from Llama-3.1-8B with sparse autoencoders

Z He, W Shu, X Ge, L Chen, J Wang, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Sparse Autoencoders (SAEs) have emerged as a powerful unsupervised method for
extracting sparse representations from language models, yet scalable training remains a …

Style-specific neurons for steering LLMs in text style transfer

W Lai, V Hangya, A Fraser - arXiv preprint arXiv:2410.00593, 2024 - arxiv.org
Text style transfer (TST) aims to modify the style of a text without altering its original
meaning. Large language models (LLMs) demonstrate superior performance across …