From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI

M Nauta, J Trienes, S Pathak, E Nguyen… - ACM Computing …, 2023 - dl.acm.org
The rising popularity of explainable artificial intelligence (XAI) to understand high-performing
black boxes raised the question of how to evaluate explanations of machine learning (ML) …

On the explainability of natural language processing deep models

JE Zini, M Awad - ACM Computing Surveys, 2022 - dl.acm.org
Despite their success, deep networks are used as black-box models with outputs that are not
easily explainable during the learning and the prediction phases. This lack of interpretability …

Decoding semantic representations in mind and brain

SL Frisby, AD Halai, CR Cox, MAL Ralph… - Trends in Cognitive …, 2023 - cell.com
A key goal for cognitive neuroscience is to understand the neurocognitive systems that
support semantic memory. Recent multivariate analyses of neuroimaging data have …

Interpreting deep learning models in natural language processing: A review

X Sun, D Yang, X Li, T Zhang, Y Meng, H Qiu… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural network models have achieved state-of-the-art performances in a wide range of
natural language processing (NLP) tasks. However, a long-standing criticism against neural …

Word embeddings are steers for language models

C Han, J Xu, M Li, Y Fung, C Sun, N Jiang… - Proceedings of the …, 2024 - aclanthology.org
Language models (LMs) automatically learn word embeddings during pre-training
on language corpora. Although word embeddings are usually interpreted as feature vectors …

VICE: Variational interpretable concept embeddings

L Muttenthaler, CY Zheng, P McClure… - Advances in …, 2022 - proceedings.neurips.cc
A central goal in the cognitive sciences is the development of numerical models for mental
representations of object concepts. This paper introduces Variational Interpretable Concept …

Learning interpretable word embeddings via bidirectional alignment of dimensions with semantic concepts

LK Şenel, F Şahinuç, V Yücesoy, H Schütze… - Information Processing …, 2022 - Elsevier
We propose bidirectional imparting or BiImp, a generalized method for aligning embedding
dimensions with concepts during the embedding learning phase. While preserving the …

LM-Switch: Lightweight language model conditioning in word embedding space

C Han, J Xu, M Li, Y Fung, C Sun… - arXiv preprint arXiv …, 2023 - blender.cs.illinois.edu
In recent years, large language models (LMs) have achieved remarkable progress across
various natural language processing tasks. As pre-training and fine-tuning are costly and …

A method for constructing word sense embeddings based on word sense induction

Y Sun, J Platoš - Scientific Reports, 2023 - nature.com
Polysemy is an inherent characteristic of natural language. In order to make it easier to
distinguish between different senses of polysemous words, we propose a method for …

Neural variational sparse topic model for sparse explainable text representation

Q **e, P Tiwari, D Gupta, J Huang, M Peng - Information Processing & …, 2021 - Elsevier
Texts are the major information carrier for internet users, from which learning the latent
representations has important research and practical value. Neural topic models have been …