We're afraid language models aren't modeling ambiguity
Ambiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of
human language understanding, allowing us to anticipate misunderstanding as …
Investigating reasons for disagreement in natural language inference
We investigate how disagreement in natural language inference (NLI) annotation arises. We
developed a taxonomy of disagreement sources with 10 categories spanning 3 high-level …
Stop measuring calibration when humans disagree
Calibration is a popular framework to evaluate whether a classifier knows when it does not
know, i.e., whether its predictive probabilities are a good indication of how likely a prediction is to be …
SemEval-2024 Task 6: SHROOM, a shared-task on hallucinations and related observable overgeneration mistakes
This paper presents the results of SHROOM, a shared task focused on detecting
hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet …
How (not) to use sociodemographic information for subjective NLP tasks
Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender,
age, educational background, etc.) have a strong impact on their decisions when working on …
Learning with different amounts of annotation: From zero to many labels
Training NLP systems typically assumes access to annotated data that has a single human
label per example. Given imperfect labeling from annotators and inherent ambiguity of …
You are what you annotate: Towards better models through annotator representations
Annotator disagreement is ubiquitous in natural language processing (NLP) tasks. There are
multiple reasons for such disagreements, including the subjectivity of the task, difficult cases …
Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?
The model editing problem concerns how language models should learn new facts about
the world over time. While empirical research on model editing has drawn widespread …
Augmenting industrial chatbots in energy systems using ChatGPT generative AI
Chatbots, the automation of communicative labor, have been widely deployed in industrial
applications and systems. Built upon the Generative Pre-trained Transformer 3 (GPT-3) …
"Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?
Human label variation (HLV) is a valuable source of information that arises when multiple
human annotators provide different labels for valid reasons. In Natural Language Inference …