Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery

S Rao, S Mahajan, M Böhle, B Schiele - European Conference on …, 2024 - Springer
Concept Bottleneck Models (CBMs) have recently been proposed to address the
'black-box' problem of deep neural networks, by first mapping images to a human …

A holistic approach to unifying automatic concept extraction and concept importance estimation

T Fel, V Boutin, L Béthune, R Cadène… - Advances in …, 2024 - proceedings.neurips.cc
In recent years, concept-based approaches have emerged as some of the most promising
explainability methods to help us interpret the decisions of Artificial Neural Networks (ANNs) …

Disentangling neuron representations with concept vectors

L O'Mahony, V Andrearczyk… - Proceedings of the …, 2023 - openaccess.thecvf.com
Mechanistic interpretability aims to understand how models store representations by
breaking down neural networks into interpretable units. However, the occurrence of …

Interpretability is in the mind of the beholder: A causal framework for human-interpretable representation learning

E Marconato, A Passerini, S Teso - Entropy, 2023 - mdpi.com
Research on Explainable Artificial Intelligence has recently started exploring the idea of
producing explanations that, rather than being expressed in terms of low-level features, are …

Towards Utilising a Range of Neural Activations for Comprehending Representational Associations

L O'Mahony, NS Nikolov, DJP O'Sullivan - arXiv preprint arXiv:2411.10019, 2024 - arxiv.org
Recent efforts to understand intermediate representations in deep neural networks have
commonly attempted to label individual neurons and combinations of neurons that make up …

Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions

H Fokkema, T van Erven, S Magliacane - arXiv preprint arXiv:2502.06536, 2025 - arxiv.org
Machine learning is a vital part of many real-world systems, but several concerns remain
about the lack of interpretability, explainability and robustness of black-box AI systems …

Extracting Concepts From Neural Networks Using Conceptor-based Clustering

J Peters - 2024 - fse.studenttheses.ub.rug.nl
Conceptors are versatile neuro-symbolic formalizations of concepts as they arise in neural
networks, with promising results on supervised tasks. However, the use of conceptors in …