Data and its (dis)contents: A survey of dataset development and use in machine learning research

A Paullada, ID Raji, EM Bender, E Denton, A Hanna - Patterns, 2021 - cell.com
In this work, we survey a breadth of literature that has revealed the limitations of
predominant practices for dataset collection and use in the field of machine learning. We …

Ethics-based AI auditing: A systematic literature review on conceptualizations of ethical principles and knowledge contributions to stakeholders

J Laine, M Minkkinen, M Mäntymäki - Information & Management, 2024 - Elsevier
This systematic literature review synthesizes the conceptualizations of ethical principles in AI
auditing literature and the knowledge contributions to the stakeholders of AI auditing. We …

Into the laion's den: Investigating hate in multimodal datasets

A Birhane, S Han, V Boddeti… - Advances in neural …, 2023 - proceedings.neurips.cc
'Scale the model, scale the data, scale the compute' is the reigning sentiment in the
world of generative AI today. While the impact of model scaling has been extensively …

The atlas of AI: Power, politics, and the planetary costs of artificial intelligence

K Crawford - 2021 - books.google.com
The hidden costs of artificial intelligence, from natural resources and labor to privacy and
freedom. What happens when artificial intelligence saturates political life and depletes the …

Do datasets have politics? Disciplinary values in computer vision dataset development

MK Scheuerman, A Hanna, E Denton - … of the ACM on Human-Computer …, 2021 - dl.acm.org
Data is a crucial component of machine learning. The field is reliant on data to train, validate,
and test models. With increased technical capabilities, machine learning research has …

Generative AI and the politics of visibility

T Gillespie - Big Data & Society, 2024 - journals.sagepub.com
Proponents of generative AI tools claim they will supplement, even replace, the work of
cultural production. This raises questions about the politics of visibility: what kinds of stories …

Where is the human in human-centered AI? Insights from developer priorities and user experiences

WJ Bingley, C Curtis, S Lockey, A Bialkowski… - Computers in Human …, 2023 - Elsevier
Human-centered artificial intelligence (HCAI) seeks to shift the focus in AI development from
technology to people. However, it is not clear whether existing HCAI principles and practices …

On the genealogy of machine learning datasets: A critical history of ImageNet

E Denton, A Hanna, R Amironesei, A Smart… - Big Data & …, 2021 - journals.sagepub.com
In response to growing concerns of bias, discrimination, and unfairness perpetuated by
algorithmic systems, the datasets used to train and evaluate machine learning models have …

Towards accountability for machine learning datasets: Practices from software engineering and infrastructure

B Hutchinson, A Smart, A Hanna, E Denton… - Proceedings of the …, 2021 - dl.acm.org
Datasets that power machine learning are often used, shared, and reused with little visibility
into the processes of deliberation that led to their creation. As artificial intelligence systems …

The data-production dispositif

M Miceli, J Posada - Proceedings of the ACM on human-computer …, 2022 - dl.acm.org
Machine learning (ML) depends on data to train and verify models. Very often, organizations
outsource processes related to data work (i.e., generating and annotating data and …