Machine knowledge: Creation and curation of comprehensive knowledge bases
Equipping machines with comprehensive knowledge of the world's entities and their
relationships has been a longstanding goal of AI. Over the last decade, large-scale …
A systematic review of machine learning techniques for stance detection and its applications
Stance detection is an evolving opinion mining research area motivated by the vast increase
in the variety and volume of user-generated content. In this regard, considerable research …
Factscore: Fine-grained atomic evaluation of factual precision in long form text generation
Evaluating the factuality of long-form text generated by large language models (LMs) is non-
trivial because (1) generations often contain a mixture of supported and unsupported pieces …
Evaluating verifiability in generative search engines
Generative search engines directly generate responses to user queries, along with in-line
citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., …
ERASER: A benchmark to evaluate rationalized NLP models
State-of-the-art models in NLP are now predominantly based on deep neural networks that
are opaque in terms of how they come to make predictions. This limitation has increased …
Fact or fiction: Verifying scientific claims
We introduce scientific claim verification, a new task to select abstracts from the research
literature containing evidence that SUPPORTS or REFUTES a given scientific claim, and to …
Evaluating models' local decision boundaries via contrast sets
Standard test sets for supervised learning evaluate in-distribution generalization.
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these …
Cline: Contrastive learning with semantic negative examples for natural language understanding
Although pre-trained language models have proven useful for learning high-quality semantic
representations, these models are still vulnerable to simple perturbations. Recent works …
MultiFC: A real-world multi-domain dataset for evidence-based fact checking of claims
We contribute the largest publicly available dataset of naturally occurring factual claims for
the purpose of automatic claim verification. It is collected from 26 fact checking websites in …
Identifying the human values behind arguments
J Kiesel, M Alshomary, N Handke, X Cai… - Proceedings of the …, 2022 - aclanthology.org
This paper studies the (often implicit) human values behind natural language arguments,
such as to have freedom of thought or to be broadminded. Values are commonly accepted …