AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, so do risks from misalignment. To provide a comprehensive …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Explaining machine learning models with interactive natural language conversations using TalkToModel

D Slack, S Krishna, H Lakkaraju, S Singh - Nature Machine Intelligence, 2023 - nature.com
Practitioners increasingly use machine learning (ML) models, yet models have become
more complex and harder to understand. To understand complex models, researchers have …

Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical image …, 2023 - Elsevier
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bear …

Post hoc explanations of language models can improve language models

S Krishna, J Ma, D Slack… - Advances in …, 2023 - proceedings.neurips.cc
Large Language Models (LLMs) have demonstrated remarkable capabilities in
performing complex tasks. Moreover, recent research has shown that incorporating human …

Applying interpretable machine learning in computational biology—pitfalls, recommendations and opportunities for new developments

V Chen, M Yang, W Cui, JS Kim, A Talwalkar, J Ma - Nature Methods, 2024 - nature.com
Recent advances in machine learning have enabled the development of next-generation
predictive models for complex computational biology problems, thereby spurring the use of …

How interpretable machine learning can benefit process understanding in the geosciences

S Jiang, L Sweet, G Blougouras, A Brenning… - Earth's …, 2024 - Wiley Online Library
Interpretable Machine Learning (IML) has rapidly advanced in recent years, offering
new opportunities to improve our understanding of the complex Earth system. IML goes …

Which explanation should I choose? A function approximation perspective to characterizing post hoc explanations

T Han, S Srinivas, H Lakkaraju - Advances in neural …, 2022 - proceedings.neurips.cc
A critical problem in the field of post hoc explainability is the lack of a common foundational
goal among methods. For example, some methods are motivated by function approximation …

Policy advice and best practices on bias and fairness in AI

JM Alvarez, AB Colmenarejo, A Elobaid… - Ethics and Information …, 2024 - Springer
The literature addressing bias and fairness in AI models (fair-AI) is growing at a fast pace,
making it difficult for novel researchers and practitioners to have a bird's-eye view picture of …

Can large language models explain themselves? A study of LLM-generated self-explanations

S Huang, S Mamidanna, S Jangam, Y Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) such as ChatGPT have demonstrated superior performance
on a variety of natural language processing (NLP) tasks including sentiment analysis …