Explainable artificial intelligence: a comprehensive review

D Minh, HX Wang, YF Li, TN Nguyen - Artificial Intelligence Review, 2022 - Springer
Thanks to the exponential growth in computing power and vast amounts of data, artificial
intelligence (AI) has witnessed remarkable developments in recent years, enabling it to be …

Post-hoc interpretability for neural NLP: A survey

A Madsen, S Reddy, S Chandar - ACM Computing Surveys, 2022 - dl.acm.org
Neural networks for NLP are becoming increasingly complex and widespread, and there is a
growing concern if these models are responsible to use. Explaining models helps to address …

Explainability for large language models: A survey

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond

X Li, H Xiong, X Li, X Wu, X Zhang, J Liu, J Bian… - … and Information Systems, 2022 - Springer
Deep neural networks have been well-known for their superb handling of various machine
learning and artificial intelligence tasks. However, due to their over-parameterized black-box …

Toward transparent AI: A survey on interpreting the inner structures of deep neural networks

T Räuker, A Ho, S Casper… - 2023 IEEE conference …, 2023 - ieeexplore.ieee.org
The last decade of machine learning has seen drastic increases in scale and capabilities.
Deep neural networks (DNNs) are increasingly being deployed in the real world. However …

Explainable artificial intelligence: a systematic review

G Vilone, L Longo - arXiv preprint arXiv:2006.00093, 2020 - arxiv.org
Explainable Artificial Intelligence (XAI) has experienced a significant growth over the last few
years. This is due to the widespread application of machine learning, particularly deep …

A multiscale visualization of attention in the transformer model

J Vig - arXiv preprint arXiv:1906.05714, 2019 - arxiv.org
The Transformer is a sequence model that forgoes traditional recurrent architectures in favor
of a fully attention-based approach. Besides improving performance, an advantage of using …

Artificial Intelligence and Black‐Box Medical Decisions: Accuracy versus Explainability

AJ London - Hastings Center Report, 2019 - Wiley Online Library
Although decision‐making algorithms are not new to medicine, the availability of vast stores
of medical data, gains in computing power, and breakthroughs in machine learning are …

On the explainability of natural language processing deep models

JE Zini, M Awad - ACM Computing Surveys, 2022 - dl.acm.org
Despite their success, deep networks are used as black-box models with outputs that are not
easily explainable during the learning and the prediction phases. This lack of interpretability …