Red-teaming for generative AI: Silver bullet or security theater?

M Feffer, A Sinha, WH Deng, ZC Lipton… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
In response to rising concerns surrounding the safety, security, and trustworthiness of
Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red …

Roleplay-doh: Enabling domain-experts to create LLM-simulated patients via eliciting and adhering to principles

R Louie, A Nandi, W Fang, C Chang, E Brunskill… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent works leverage LLMs to roleplay realistic social scenarios, aiding novices in
practicing their social skills. However, simulating sensitive interactions, such as in mental …
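The snippet hints at prompting an LLM to follow expert-elicited principles. As a rough sketch of that general idea (the persona, principles, and prompt template below are illustrative assumptions, not Roleplay-doh's actual pipeline):

```python
# Hypothetical expert-elicited principles for an LLM-simulated patient.
principles = [
    "Disclose emotional details gradually rather than all at once.",
    "Describe symptoms in lay terms; never volunteer a diagnosis.",
    "Stay in character even if the trainee asks meta questions.",
]

def build_system_prompt(persona: str, principles: list[str]) -> str:
    """Assemble a roleplay system prompt that asks the model to adhere
    to each numbered principle in every reply."""
    rules = "\n".join(f"{i}. {p}" for i, p in enumerate(principles, 1))
    return (
        f"You are roleplaying {persona}.\n"
        f"Follow these principles in every reply:\n{rules}"
    )

print(build_system_prompt("a patient with mild social anxiety", principles))
```

The returned string would serve as the system message for whichever chat model backs the simulation.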

Skin deep: Investigating subjectivity in skin tone annotations for computer vision benchmark datasets

T Barrett, Q Chen, A Zhang - Proceedings of the 2023 ACM Conference …, 2023 - dl.acm.org
To investigate the well-observed racial disparities in computer vision systems that analyze
images of humans, researchers have turned to skin tone as a more objective annotation …

Judgment Sieve: Reducing uncertainty in group judgments through interventions targeting ambiguity versus disagreement

QZ Chen, AX Zhang - Proceedings of the ACM on Human-Computer …, 2023 - dl.acm.org
When groups of people are tasked with making a judgment, the issue of uncertainty often
arises. Existing methods to reduce uncertainty typically focus on iteratively improving …
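The title's contrast between ambiguity and disagreement can be made concrete with a toy decomposition (an illustrative assumption about repeated judgments, not the paper's Judgment Sieve interventions): if each rater judges an item twice, within-rater variance proxies for ambiguity, while between-rater variance of mean judgments proxies for disagreement.

```python
import numpy as np

# Hypothetical ratings[rater, repetition] for a single item on a 1-7
# scale; the repeated-judgment design is assumed for illustration only.
ratings = np.array([
    [6, 6],   # rater 1: self-consistent, high
    [2, 3],   # rater 2: self-consistent, low
    [6, 5],   # rater 3
])

# Ambiguity proxy: average variance of each rater against themselves.
ambiguity = np.mean(np.var(ratings, axis=1))

# Disagreement proxy: variance of raters' mean judgments across raters.
disagreement = np.var(np.mean(ratings, axis=1))

print(f"ambiguity ~ {ambiguity:.2f}, disagreement ~ {disagreement:.2f}")
```

Here disagreement dominates, suggesting an intervention targeting rater differences rather than item ambiguity.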

Closing the Knowledge Gap in Designing Data Annotation Interfaces for AI-powered Disaster Management Analytic Systems

Z Ara, H Salemi, SR Hong, Y Senarath… - Proceedings of the 29th …, 2024 - dl.acm.org
Data annotation interfaces predominantly leverage ground truth labels to guide annotators
toward accurate responses. With the growing adoption of Artificial Intelligence (AI) in domain …

Are human explanations always helpful? Towards objective evaluation of human natural language explanations

B Yao, P Sen, L Popa, J Hendler, D Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Human-annotated labels and explanations are critical for training explainable NLP models.
However, unlike human-annotated labels whose quality is easier to calibrate (e.g., with a …

Case repositories: Towards case-based reasoning for AI alignment

KJ Feng, QZ Chen, I Cheong, K Xia… - arXiv preprint arXiv …, 2023 - arxiv.org
Case studies commonly form the pedagogical backbone in law, ethics, and many other
domains that face complex and ambiguous societal questions informed by human values …

Impact of annotator demographics on sentiment dataset labeling

Y Ding, J You, TK Machulla, J Jacobs, P Sen… - Proceedings of the …, 2022 - dl.acm.org
As machine learning methods become more powerful and capture more nuances of human
behavior, biases in the dataset can shape what the model learns and is evaluated on. This …
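One simple way to probe the claim that annotator demographics shape labels (a hypothetical check with made-up numbers, not the paper's analysis) is a chi-square test of independence between annotator group and assigned sentiment label:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows are annotator demographic groups,
# columns are sentiment labels (negative, neutral, positive).
table = np.array([
    [30, 50, 20],   # group 1
    [15, 45, 40],   # group 2
])

# A small p-value suggests label distributions differ across groups,
# i.e., annotator demographics may be shaping the dataset.
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}, dof={dof}")
```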

Ground-Truth, Whose Truth? – Examining the Challenges with Annotating Toxic Text Datasets

K Arhin, I Baldini, D Wei, KN Ramamurthy… - arXiv preprint arXiv …, 2021 - arxiv.org
The use of machine learning (ML)-based language models (LMs) to monitor content online
is on the rise. For toxic text identification, task-specific fine-tuning of these models is …

Mitigating voter attribute bias for fair opinion aggregation

R Ueda, K Takeuchi, H Kashima - Proceedings of the 2023 AAAI/ACM …, 2023 - dl.acm.org
The aggregation of multiple opinions plays a crucial role in decision-making, such as in
hiring and loan review, and in labeling data for supervised learning. Although majority voting …
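The snippet's caveat about majority voting can be illustrated with a minimal sketch (hypothetical data and grouping, not the paper's aggregation method): when one annotator group is over-represented, a plain majority vote and a group-balanced vote can disagree.

```python
from collections import Counter, defaultdict

# Toy ballots as (annotator_group, label); group A is over-represented.
votes = [("A", 1), ("A", 1), ("A", 1), ("A", 1), ("A", 0),
         ("B", 0), ("B", 0)]

def majority_vote(votes):
    """Plain majority vote: every ballot counts once."""
    return Counter(label for _, label in votes).most_common(1)[0][0]

def group_balanced_vote(votes):
    """Reweight ballots so each group contributes one unit of total
    weight, regardless of how many annotators it supplied."""
    group_sizes = Counter(group for group, _ in votes)
    weight = defaultdict(float)
    for group, label in votes:
        weight[label] += 1.0 / group_sizes[group]
    return max(weight, key=weight.get)

print(majority_vote(votes))        # 1: group A's headcount wins (4 vs 3)
print(group_balanced_vote(votes))  # 0: equal group weight (0.8 vs 1.2)
```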