Advances, challenges and opportunities in creating data for trustworthy AI

W Liang, GA Tadesse, D Ho, L Fei-Fei… - Nature Machine …, 2022 - nature.com
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …

Data and its (dis)contents: A survey of dataset development and use in machine learning research

A Paullada, ID Raji, EM Bender, E Denton, A Hanna - Patterns, 2021 - cell.com
In this work, we survey a breadth of literature that has revealed the limitations of
predominant practices for dataset collection and use in the field of machine learning. We …

[BOOK] Towards a standard for identifying and managing bias in artificial intelligence

R Schwartz, A Vassilev, K Greene… - 2022 - dwt.com
As individuals and communities interact in and with an environment that is increasingly
virtual, they are often vulnerable to the commodification of their digital footprint. Concepts …

Confident learning: Estimating uncertainty in dataset labels

C Northcutt, L Jiang, I Chuang - Journal of Artificial Intelligence Research, 2021 - jair.org
Learning exists in the context of data, yet notions of confidence typically focus on model
predictions, not label quality. Confident learning (CL) is an alternative approach which …

When do we not need larger vision models?

B Shi, Z Wu, M Mao, X Wang, T Darrell - European Conference on …, 2024 - Springer
Scaling up the size of vision models has been the de facto standard to obtain more powerful
visual representations. In this work, we discuss the point beyond which larger vision models …

Towards unbounded machine unlearning

M Kurmanji, P Triantafillou, J Hayes… - Advances in neural …, 2024 - proceedings.neurips.cc
Deep machine unlearning is the problem of 'removing' from a trained neural network a subset
of its training set. This problem is very timely and has many applications, including the key …

Partial success in closing the gap between human and machine vision

R Geirhos, K Narayanappa, B Mitzkus… - Advances in …, 2021 - proceedings.neurips.cc
A few years ago, the first CNN surpassed human performance on ImageNet. However, it
soon became clear that machines lack robustness on more challenging test cases, a major …

Diversify your vision datasets with automatic diffusion-based augmentation

L Dunlap, A Umino, H Zhang, J Yang… - Advances in neural …, 2023 - proceedings.neurips.cc
Many fine-grained classification tasks, like rare animal identification, have limited training
data and consequently classifiers trained on these datasets often fail to generalize to …

Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset

M Groh, C Harris, L Soenksen, F Lau… - Proceedings of the …, 2021 - openaccess.thecvf.com
How does the accuracy of deep neural network models trained to classify clinical images of
skin conditions vary across skin color? While recent studies demonstrate computer vision …

Evaluations of machine learning privacy defenses are misleading

M Aerni, J Zhang, F Tramèr - Proceedings of the 2024 on ACM SIGSAC …, 2024 - dl.acm.org
Empirical defenses for machine learning privacy forgo the provable guarantees of
differential privacy in the hope of achieving higher utility while resisting realistic adversaries …