A review of the explainability and safety of conversational agents for mental health to identify avenues for improvement

S Sarkar, M Gaur, LK Chen, M Garg… - Frontiers in Artificial …, 2023 - frontiersin.org
Virtual Mental Health Assistants (VMHAs) continuously evolve to support the overloaded
global healthcare system, which receives approximately 60 million primary care visits and 6 …

Multi-objective molecule generation using interpretable substructures

W Jin, R Barzilay, T Jaakkola - International conference on …, 2020 - proceedings.mlr.press
Drug discovery aims to find novel compounds with specified chemical property profiles. In
terms of generative modeling, the goal is to learn to sample molecules in the intersection of …

Edge: Explaining deep reinforcement learning policies

W Guo, X Wu, U Khan, X Xing - Advances in Neural …, 2021 - proceedings.neurips.cc
With the rapid development of deep reinforcement learning (DRL) techniques, there is an
increasing need to understand and interpret DRL policies. While recent research has …

Flexible and context-specific AI explainability: a multidisciplinary approach

V Beaudouin, I Bloch, D Bounie, S Clémençon… - arXiv preprint arXiv …, 2020 - arxiv.org
The recent enthusiasm for artificial intelligence (AI) is due principally to advances in deep
learning. Deep learning methods are remarkably accurate, but also opaque, which limits …

A game theoretic approach to class-wise selective rationalization

S Chang, Y Zhang, M Yu… - Advances in neural …, 2019 - proceedings.neurips.cc
Selection of input features such as relevant pieces of text has become a common technique
of highlighting how complex neural predictors operate. The selection can be optimized post …

Regularizing black-box models for improved interpretability

G Plumb, M Al-Shedivat, ÁA Cabrera… - Advances in …, 2020 - proceedings.neurips.cc
Most of the work on interpretable machine learning has focused on designing either
inherently interpretable models, which typically trade-off accuracy for interpretability, or post …

A framework to learn with interpretation

J Parekh, P Mozharovskyi… - Advances in Neural …, 2021 - proceedings.neurips.cc
To tackle interpretability in deep learning, we present a novel framework to jointly learn a
predictive model and its associated interpretation model. The interpreter provides both local …

First is better than last for language data influence

CK Yeh, A Taly, M Sundararajan… - Advances in Neural …, 2022 - proceedings.neurips.cc
The ability to identify influential training examples enables us to debug training data and
explain model behavior. Existing techniques to do so are based on the flow of training data …

Concept gradient: Concept-based interpretation without linear assumption

A Bai, CK Yeh, P Ravikumar, NYC Lin… - arXiv preprint arXiv …, 2022 - arxiv.org
Concept-based interpretations of black-box models are often more intuitive for humans to
understand. The most widely adopted approach for concept-based interpretation is Concept …

Focal modulation networks for interpretable sound classification

L Della Libera, C Subakan… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
The increasing success of deep neural networks has raised concerns about their inherent
black-box nature, posing challenges related to interpretability and trust. While there has …