From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable AI

M Nauta, J Trienes, S Pathak, E Nguyen… - ACM Computing …, 2023 - dl.acm.org
The rising popularity of explainable artificial intelligence (XAI) to understand high-performing
black boxes raised the question of how to evaluate explanations of machine learning (ML) …

Towards faithful model explanation in NLP: A survey

Q Lyu, M Apidianaki, C Callison-Burch - Computational Linguistics, 2024 - direct.mit.edu
End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to
understand. This has given rise to numerous efforts towards model explainability in recent …

Measuring association between labels and free-text rationales

S Wiegreffe, A Marasović, NA Smith - arXiv preprint arXiv:2010.12762, 2020 - arxiv.org
In interpretable NLP, we require faithful rationales that reflect the model's decision-making
process for an explained instance. While prior work focuses on extractive rationales (a …

Do models explain themselves? Counterfactual simulatability of natural language explanations

Y Chen, R Zhong, N Ri, C Zhao, H He… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are trained to imitate humans to explain human decisions.
However, do LLMs explain themselves? Can they help humans build mental models of how …

Evaluating saliency methods for neural language models

S Ding, P Koehn - arXiv preprint arXiv:2104.05824, 2021 - arxiv.org
Saliency methods are widely used to interpret neural network predictions, but different
variants of saliency methods often disagree even on the interpretations of the same …

Wino-X: Multilingual Winograd schemas for commonsense reasoning and coreference resolution

D Emelin, R Sennrich - 2021 Conference on Empirical Methods …, 2021 - research.ed.ac.uk
Winograd schemas are a well-established tool for evaluating coreference resolution (CoR)
and commonsense reasoning (CSR) capabilities of computational models. So far, schemas …

TranSmart: A practical interactive machine translation system

G Huang, L Liu, X Wang, L Wang, H Li, Z Tu… - arXiv preprint arXiv …, 2021 - arxiv.org
Automatic machine translation is super efficient to produce translations yet their quality is not
guaranteed. This technique report introduces TranSmart, a practical human-machine …

Survey of Arabic machine translation, methodologies, progress, and challenges

D Gamal, M Alfonse… - 2022 2nd International …, 2022 - ieeexplore.ieee.org
For the longest time, translation was a labor-intensive process that required just human
effort. While human translation remains the most reliable method of textual content …

Measuring and improving faithfulness of attention in neural machine translation

P Moradi, N Kambhatla, A Sarkar - … of the 16th Conference of the …, 2021 - aclanthology.org
While the attention heatmaps produced by neural machine translation (NMT) models seem
insightful, there is little evidence that they reflect a model's true internal reasoning. We …

Dissecting generation modes for abstractive summarization models via ablation and attribution

J Xu, G Durrett - arXiv preprint arXiv:2106.01518, 2021 - arxiv.org
Despite the prominence of neural abstractive summarization models, we know little about
how they actually form summaries and how to understand where their decisions come from …