Fairness in machine learning: A survey

S Caton, C Haas - ACM Computing Surveys, 2024 - dl.acm.org
When Machine Learning technologies are used in contexts that affect citizens, companies as
well as researchers need to be confident that there will not be any unexpected social …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

SuperGLUE: A stickier benchmark for general-purpose language understanding systems

A Wang, Y Pruksachatkun, N Nangia… - Advances in neural …, 2019 - proceedings.neurips.cc
In the last year, new models and methods for pretraining and transfer learning have driven
striking performance improvements across a range of language understanding tasks. The …

Language (technology) is power: A critical survey of "bias" in NLP

SL Blodgett, S Barocas, H Daumé III… - arXiv preprint arXiv …, 2020 - arxiv.org
We survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are
often vague, inconsistent, and lacking in normative reasoning, despite the fact that …

FairFace: Face attribute dataset for balanced race, gender, and age for bias measurement and mitigation

K Karkkainen, J Joo - Proceedings of the IEEE/CVF winter …, 2021 - openaccess.thecvf.com
Existing public face image datasets are strongly biased toward Caucasian faces, and other
races (e.g., Latino) are significantly underrepresented. The models trained from such datasets …

StereoSet: Measuring stereotypical bias in pretrained language models

M Nadeem, A Bethke, S Reddy - arXiv preprint arXiv:2004.09456, 2020 - arxiv.org
A stereotype is an over-generalized belief about a particular group of people, e.g., Asians are
good at math or Asians are bad drivers. Such beliefs (biases) are known to hurt target …

Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in AI

A Jacovi, A Marasović, T Miller… - Proceedings of the 2021 …, 2021 - dl.acm.org
Trust is a central component of the interaction between people and AI, in that 'incorrect' levels
of trust may cause misuse, abuse or disuse of the technology. But what, precisely, is the …

Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing

ID Raji, A Smart, RN White, M Mitchell… - Proceedings of the …, 2020 - dl.acm.org
Rising concern for the societal implications of artificial intelligence systems has inspired a
wave of academic and journalistic literature in which deployed systems are audited for harm …

Evaluating verifiability in generative search engines

NF Liu, T Zhang, P Liang - arXiv preprint arXiv:2304.09848, 2023 - arxiv.org
Generative search engines directly generate responses to user queries, along with in-line
citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e. …

[BOOK] Fairness and machine learning: Limitations and opportunities

S Barocas, M Hardt, A Narayanan - 2023 - books.google.com
An introduction to the intellectual foundations and practical utility of the recent work on
fairness and machine learning. Fairness and Machine Learning introduces advanced …