A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

Detecting and mitigating hallucinations in multilingual summarisation

Y Qiu, Y Ziser, A Korhonen, EM Ponti… - arXiv preprint arXiv …, 2023 - arxiv.org
Hallucinations pose a significant challenge to the reliability of neural models for abstractive
summarisation. While automatically generated summaries may be fluent, they often lack …

Text data augmentation using generative adversarial networks – a systematic review

KS Kanishka Silva, B Can, R Sarwar… - Journal of …, 2023 - e-space.mmu.ac.uk
Insufficient data is one of the main drawbacks in natural language processing tasks, and the
most prevalent solution is to collect a decent amount of data that will be enough for the …

Multilayer encoder and single-layer decoder for abstractive Arabic text summarization

D Suleiman, A Awajan - Knowledge-Based Systems, 2022 - Elsevier
In this paper, an abstractive Arabic text summarization model that is based on sequence-to-
sequence recurrent neural networks is proposed. It consists of a multilayer encoder and …

A feature-space multimodal data augmentation technique for text-video retrieval

A Falcon, G Serra, O Lanz - Proceedings of the 30th ACM International …, 2022 - dl.acm.org
Every hour, huge amounts of visual contents are posted on social media and user-
generated content platforms. To find relevant videos by means of a natural language query …

FFCI: A framework for interpretable automatic evaluation of summarization

F Koto, T Baldwin, JH Lau - Journal of Artificial Intelligence Research, 2022 - jair.org
In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that
comprises four elements: faithfulness (degree of factual consistency with the source), focus …

Long document summarization in a low resource setting using pretrained language models

A Bajaj, P Dangati, K Krishna, PA Kumar… - arXiv preprint arXiv …, 2021 - arxiv.org
Abstractive summarization is the task of compressing a long document into a coherent short
document while retaining salient information. Modern abstractive summarization methods …

Align-then-abstract representation learning for low-resource summarization

G Moro, L Ragazzi - Neurocomputing, 2023 - Elsevier
Generative transformer-based models have achieved state-of-the-art performance in text
summarization. Nevertheless, they still struggle in real-world scenarios with long documents …

Summarization of Lengthy Legal Documents via Abstractive Dataset Building: An Extract-then-Assign Approach

D Jain, MD Borah, A Biswas - Expert Systems with Applications, 2024 - Elsevier
Development of effective automatic summarization approaches for legal documents
suffer from several challenges like extremely long document-summary pairs, lack of large …

Counterfactual data augmentation improves factuality of abstractive summarization

D Rajagopal, S Shakeri, CN Santos, E Hovy… - arXiv preprint arXiv …, 2022 - arxiv.org
Abstractive summarization systems based on pretrained language models often generate
coherent but factually inconsistent sentences. In this paper, we present a counterfactual data …