Transparency of deep neural networks for medical image analysis: A review of interpretability methods

Z Salahuddin, HC Woodruff, A Chatterjee… - Computers in biology and …, 2022 - Elsevier
Artificial Intelligence (AI) has emerged as a useful aid in numerous clinical applications for
diagnosis and treatment decisions. Deep neural networks have shown the same or better …

Notions of explainability and evaluation approaches for explainable artificial intelligence

G Vilone, L Longo - Information Fusion, 2021 - Elsevier
Explainable Artificial Intelligence (XAI) has experienced significant growth over
the last few years. This is due to the widespread application of machine learning, particularly …

From attribution maps to human-understandable explanations through concept relevance propagation

R Achtibat, M Dreyer, I Eisenbraun, S Bosse… - Nature Machine …, 2023 - nature.com
The field of explainable artificial intelligence (XAI) aims to bring transparency to today's
powerful but opaque deep learning models. While local XAI methods explain individual …

Representation engineering: A top-down approach to AI transparency

A Zou, L Phan, S Chen, J Campbell, P Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we identify and characterize the emerging area of representation engineering
(RepE), an approach to enhancing the transparency of AI systems that draws on insights …

Multimodal datasets: misogyny, pornography, and malignant stereotypes

A Birhane, VU Prabhu, E Kahembwe - arXiv preprint arXiv:2110.01963, 2021 - arxiv.org
We have now entered the era of trillion parameter machine learning models trained on
billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has …

A survey on neural network interpretability

Y Zhang, P Tiňo, A Leonardis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Along with the great success of deep neural networks, there is also growing concern about
their black-box nature. The interpretability issue affects people's trust in deep learning …

Transformer interpretability beyond attention visualization

H Chefer, S Gur, L Wolf - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Self-attention techniques, and specifically Transformers, are dominating the field of text
processing and are becoming increasingly popular in computer vision classification tasks. In …

Parameterized explainer for graph neural network

D Luo, W Cheng, D Xu, W Yu, B Zong… - Advances in neural …, 2020 - proceedings.neurips.cc
Despite recent progress in Graph Neural Networks (GNNs), explaining predictions made by
GNNs remains a challenging open problem. The leading method mainly addresses the local …

Reconstructing training data from trained neural networks

N Haim, G Vardi, G Yehudai… - Advances in Neural …, 2022 - proceedings.neurips.cc
Understanding to what extent neural networks memorize training data is an intriguing
question with practical and theoretical implications. In this paper we show that in some …

Transformer feed-forward layers are key-value memories

M Geva, R Schuster, J Berant, O Levy - arXiv preprint arXiv:2012.14913, 2020 - arxiv.org
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …