Transparency of deep neural networks for medical image analysis: A review of interpretability methods
Artificial Intelligence (AI) has emerged as a useful aid in numerous clinical applications for
diagnosis and treatment decisions. Deep neural networks have shown the same or better …
Notions of explainability and evaluation approaches for explainable artificial intelligence
Abstract Explainable Artificial Intelligence (XAI) has experienced significant growth over
the last few years. This is due to the widespread application of machine learning, particularly …
From attribution maps to human-understandable explanations through concept relevance propagation
The field of explainable artificial intelligence (XAI) aims to bring transparency to today's
powerful but opaque deep learning models. While local XAI methods explain individual …
Representation engineering: A top-down approach to AI transparency
In this paper, we identify and characterize the emerging area of representation engineering
(RepE), an approach to enhancing the transparency of AI systems that draws on insights …
Multimodal datasets: misogyny, pornography, and malignant stereotypes
We have now entered the era of trillion parameter machine learning models trained on
billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has …
A survey on neural network interpretability
Along with the great success of deep neural networks, there is also growing concern about
their black-box nature. The interpretability issue affects people's trust in deep learning …
Transformer interpretability beyond attention visualization
Self-attention techniques, and specifically Transformers, are dominating the field of text
processing and are becoming increasingly popular in computer vision classification tasks. In …
Parameterized explainer for graph neural network
Despite recent progress in Graph Neural Networks (GNNs), explaining predictions made by
GNNs remains a challenging open problem. The leading method mainly addresses the local …
Reconstructing training data from trained neural networks
Understanding to what extent neural networks memorize training data is an intriguing
question with practical and theoretical implications. In this paper we show that in some …
Transformer feed-forward layers are key-value memories
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …