Learning from disagreement: A survey

AN Uma, T Fornaciari, D Hovy, S Paun, B Plank… - Journal of Artificial …, 2021 - jair.org
Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer
evidence that humans disagree, from objective tasks such as part-of-speech tagging to more …

Fact or fiction: Verifying scientific claims

D Wadden, S Lin, K Lo, LL Wang, M van Zuylen… - arXiv preprint arXiv …, 2020 - arxiv.org
We introduce scientific claim verification, a new task to select abstracts from the research
literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to …

Are we modeling the task or the annotator? An investigation of annotator bias in natural language understanding datasets

M Geva, Y Goldberg, J Berant - arXiv preprint arXiv:1908.07898, 2019 - arxiv.org
Crowdsourcing has been the prevalent paradigm for creating natural language
understanding datasets in recent years. A common crowdsourcing practice is to recruit a …

The hitchhiker's guide to testing statistical significance in natural language processing

R Dror, G Baumer, S Shlomov… - Proceedings of the 56th …, 2018 - aclanthology.org
Statistical significance testing is a standard statistical tool designed to ensure that
experimental results are not coincidental. In this opinion/theoretical paper we discuss the …

Multilingual constituency parsing with self-attention and pre-training

N Kitaev, S Cao, D Klein - arXiv preprint arXiv:1812.11760, 2018 - arxiv.org
We show that constituency parsing benefits from unsupervised pre-training across a variety
of languages and a range of pre-training conditions. We first compare the benefits of no pre …

Beyond black & white: Leveraging annotator disagreement via soft-label multi-task learning

T Fornaciari, A Uma, S Paun, B Plank… - Proceedings of the …, 2021 - iris.unibocconi.it
Supervised learning assumes that a ground truth label exists. However, the reliability of this
ground truth depends on human annotators, who often disagree. Prior work has shown that …

Sarcasm as contrast between a positive sentiment and negative situation

E Riloff, A Qadir, P Surve, L De Silva… - Proceedings of the …, 2013 - aclanthology.org
A common form of sarcasm on Twitter consists of a positive sentiment contrasted with a
negative situation. For example, many sarcastic tweets include a positive sentiment, such as …

Problems with evaluation of word embeddings using word similarity tasks

M Faruqui, Y Tsvetkov, P Rastogi, C Dyer - arXiv preprint arXiv …, 2016 - arxiv.org
Lacking standardized extrinsic evaluation methods for vector representations of words, the
NLP community has relied heavily on word similarity tasks as a proxy for intrinsic evaluation …

Joint extraction of events and entities within a document context

B Yang, T Mitchell - arXiv preprint arXiv:1609.03632, 2016 - arxiv.org
Events and entities are closely related; entities are often actors or participants in events and
events without entities are uncommon. The interpretation of events and entities is highly …

Learning deep semantics for test completion

P Nie, R Banerjee, JJ Li, RJ Mooney… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Writing tests is a time-consuming yet essential task during software development. We
propose to leverage recent advances in deep learning for text and code generation to assist …