Explainable artificial intelligence: a comprehensive review

D Minh, HX Wang, YF Li, TN Nguyen - Artificial Intelligence Review, 2022 - Springer
Thanks to the exponential growth in computing power and vast amounts of data, artificial
intelligence (AI) has witnessed remarkable developments in recent years, enabling it to be …

A survey on neural network interpretability

Y Zhang, P Tiňo, A Leonardis… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Along with the great success of deep neural networks, there is also growing concern about
their black-box nature. The interpretability issue affects people's trust on deep learning …

Does localization inform editing? Surprising differences in causality-based localization vs. knowledge editing in language models

P Hase, M Bansal, B Kim… - Advances in Neural …, 2023 - proceedings.neurips.cc
Language models learn a great quantity of factual information during pretraining,
and recent work localizes this information to specific model weights like mid-layer MLP …

Interpretable machine learning: Fundamental principles and 10 grand challenges

C Rudin, C Chen, Z Chen, H Huang… - Statistic …, 2022 - projecteuclid.org
Interpretability in machine learning (ML) is crucial for high stakes decisions and
troubleshooting. In this work, we provide fundamental principles for interpretable ML, and …

Transformer interpretability beyond attention visualization

H Chefer, S Gur, L Wolf - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Self-attention techniques, and specifically Transformers, are dominating the field of text
processing and are becoming increasingly popular in computer vision classification tasks. In …

Generic attention-model explainability for interpreting bi-modal and encoder-decoder transformers

H Chefer, S Gur, L Wolf - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Transformers are increasingly dominating multi-modal reasoning tasks, such as visual
question answering, achieving state-of-the-art results thanks to their ability to contextualize …

Explaining deep neural networks and beyond: A review of methods and applications

W Samek, G Montavon, S Lapuschkin… - Proceedings of the …, 2021 - ieeexplore.ieee.org
With the broader and highly successful usage of machine learning (ML) in industry and the
sciences, there has been a growing demand for explainable artificial intelligence (XAI) …

Understanding the role of individual units in a deep neural network

D Bau, JY Zhu, H Strobelt, A Lapedriza, B Zhou… - Proceedings of the …, 2020 - pnas.org
Deep neural networks excel at finding hierarchical representations that solve complex tasks
over large datasets. How can we humans understand these learned representations? In this …

Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing

V Monga, Y Li, YC Eldar - IEEE Signal Processing Magazine, 2021 - ieeexplore.ieee.org
Deep neural networks provide unprecedented performance gains in many real-world
problems in signal and image processing. Despite these gains, the future development and …

A survey on explainable artificial intelligence (XAI): Toward medical XAI

E Tjoa, C Guan - IEEE transactions on neural networks and …, 2020 - ieeexplore.ieee.org
Recently, artificial intelligence and machine learning in general have demonstrated
remarkable performances in many tasks, from image processing to natural language …