Representation in AI evaluations

AS Bergman, LA Hendricks, M Rauh, B Wu… - Proceedings of the …, 2023 - dl.acm.org
Calls for representation in artificial intelligence (AI) and machine learning (ML) are
widespread, with "representation" or "representativeness" generally understood to be both …

[BOOK][B] Introduction to AI safety, ethics, and society

D Hendrycks - 2025 - library.oapen.org
As AI technology is rapidly progressing in capability and being adopted more widely across
society, it is more important than ever to understand the potential risks AI may pose and how …

Self-destructing models: Increasing the costs of harmful dual uses of foundation models

P Henderson, E Mitchell, C Manning… - Proceedings of the …, 2023 - dl.acm.org
A growing ecosystem of large, open-source foundation models has reduced the labeled
data and technical expertise necessary to apply machine learning to many new problems …

Machine learning data practices through a data curation lens: An evaluation framework

E Bhardwaj, H Gujral, S Wu, C Zogheib… - Proceedings of the …, 2024 - dl.acm.org
Studies of dataset development in machine learning call for greater attention to the data
practices that make model development possible and shape its outcomes. Many argue that …

VisAlign: Dataset for measuring the alignment between AI and humans in visual perception

J Lee, S Kim, S Won, J Lee… - Advances in …, 2023 - proceedings.neurips.cc
AI alignment refers to models acting towards human-intended goals, preferences, or ethical
principles. Analyzing the similarity between models and humans can be a proxy measure for …

The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track

E Bhardwaj, H Gujral, S Wu… - Advances in …, 2025 - proceedings.neurips.cc
Data curation is a field with origins in librarianship and archives, whose scholarship and
thinking on data issues go back centuries, if not millennia. The field of machine learning is …

Label Smarter, Not Harder: CleverLabel for Faster Annotation of Ambiguous Image Classification with Higher Quality

L Schmarje, V Grossmann, T Michels… - … German Conference on …, 2023 - Springer
High-quality data is crucial for the success of machine learning, but labeling large datasets
is often a time-consuming and costly process. While semi-supervised learning can help …

A Large-scale Dataset with Behavior, Attributes, and Content of Mobile Short-video Platform

Y Shang, C Gao, N Li, Y Li - arXiv preprint arXiv:2502.05922, 2025 - arxiv.org
Short-video platforms show an increasing impact on people's daily lives nowadays, with
billions of active users spending plenty of time each day. The interactions between users …

VisAlign: Dataset for measuring the degree of alignment between AI and humans in visual perception

J Lee, S Kim, S Won, J Lee, M Ghassemi… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment refers to models acting towards human-intended goals, preferences, or ethical
principles. Given that most large-scale deep learning models act as black boxes and cannot …