The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation

B Plank - arXiv preprint arXiv:2211.02570, 2022 - arxiv.org
Human variation in labeling is often considered noise. Annotation projects for machine
learning (ML) aim at minimizing human label variation, with the assumption to maximize …

A survey on semantic processing techniques

R Mao, K He, X Zhang, G Chen, J Ni, Z Yang… - Information …, 2024 - Elsevier
Semantic processing is a fundamental research domain in computational linguistics. In the
era of powerful pre-trained language models and large language models, the advancement …

Hate speech classifiers learn normative social stereotypes

AM Davani, M Atari, B Kennedy… - Transactions of the …, 2023 - direct.mit.edu
Social stereotypes negatively impact individuals' judgments about different groups and may
have a critical role in understanding language directed toward marginalized groups. Here …

Investigating reasons for disagreement in natural language inference

NJ Jiang, MC Marneffe - Transactions of the Association for …, 2022 - direct.mit.edu
We investigate how disagreement in natural language inference (NLI) annotation arises. We
developed a taxonomy of disagreement sources with 10 categories spanning 3 high-level …

Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation

L Schmarje, V Grossmann, C Zelenka… - Advances in …, 2022 - proceedings.neurips.cc
High-quality data is necessary for modern machine learning. However, the acquisition of
such data is difficult due to noisy and ambiguous annotations of humans. The aggregation of …

Why don't you do it right? Analysing annotators' disagreement in subjective tasks

M Sandri, E Leonardelli, S Tonelli… - Proceedings of the 17th …, 2023 - aclanthology.org
Annotators' disagreement in linguistic data has been recently the focus of multiple initiatives
aimed at raising awareness on issues related to 'majority voting' when aggregating diverging …

SemEval-2023 task 11: Learning with disagreements (LeWiDi)

E Leonardelli, A Uma, G Abercrombie… - arXiv preprint arXiv …, 2023 - arxiv.org
NLP datasets annotated with human judgments are rife with disagreements between the
judges. This is especially true for tasks depending on subjective judgments such as …

Scientific fact-checking: A survey of resources and approaches

J Vladika, F Matthes - arXiv preprint arXiv:2305.16859, 2023 - arxiv.org
The task of fact-checking deals with assessing the veracity of factual claims based on
credible evidence and background knowledge. In particular, scientific fact-checking is the …

ArMIS - the Arabic misogyny and sexism corpus with annotator subjective disagreements

D Almanea, M Poesio - Proceedings of the Thirteenth Language …, 2022 - aclanthology.org
The use of misogynistic and sexist language has increased in recent years in social media,
and is increasing in the Arabic world in reaction to reforms attempting to remove restrictions …

When do annotator demographics matter? Measuring the influence of annotator demographics with the POPQUORN dataset

J Pei, D Jurgens - arXiv preprint arXiv:2306.06826, 2023 - arxiv.org
Annotators are not fungible. Their demographics, life experiences, and backgrounds all
contribute to how they label data. However, NLP has only recently considered how …