Natural language reasoning, a survey
This survey article proposes a clearer view of Natural Language Reasoning (NLR) in the
field of Natural Language Processing (NLP), both conceptually and practically …
Biases in large language models: origins, inventory, and discussion
In this article, we introduce and discuss the pervasive issue of bias in the large language
models that are currently at the core of mainstream approaches to Natural Language …
Towards a unified multi-dimensional evaluator for text generation
Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural
Language Generation (NLG), i.e., evaluating the generated text from multiple explainable …
SPoT: Better frozen model adaptation through soft prompt transfer
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
TRUE: Re-evaluating factual consistency evaluation
Grounded text generation systems often generate text that contains factual inconsistencies,
hindering their real-world applicability. Automatic factual consistency evaluation may help …
ExT5: Towards extreme multi-task scaling for transfer learning
Despite the recent success of multi-task learning and transfer learning for natural language
processing (NLP), few works have systematically studied the effect of scaling up the number …
Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Revisiting out-of-distribution robustness in NLP: Benchmarks, analysis, and LLMs evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
QAFactEval: Improved QA-based factual consistency evaluation for summarization
Factual consistency is an essential quality of text summarization models in practical settings.
Existing work in evaluating this dimension can be broadly categorized into two lines of …
AlignScore: Evaluating factual consistency with a unified alignment function
Many text generation applications require the generated text to be factually consistent with
input information. Automatic evaluation of factual consistency is challenging. Previous work …