Data augmentation approaches in natural language processing: A survey

B Li, Y Hou, W Che - AI Open, 2022 - Elsevier
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …

A survey on data augmentation for text classification

M Bayer, MA Kaufhold, C Reuter - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years since
the early 2000s and has already entered a mature phase. While considered the most widely …

Findings of the 2019 conference on machine translation (WMT19)

L Barrault, O Bojar, MR Costa-Jussa, C Federmann… - 2019 - zora.uzh.ch
This paper presents the results of the premier shared task organized alongside the
Conference on Machine Translation (WMT) 2019. Participants were asked to build machine …

An empirical survey of data augmentation for limited data learning in NLP

J Chen, D Tam, C Raffel, M Bansal… - Transactions of the …, 2023 - direct.mit.edu
NLP has achieved great progress in the past decade through the use of neural models and
large labeled datasets. The dependence on abundant data prevents NLP models from being …

An analysis of simple data augmentation for named entity recognition

X Dai, H Adel - arXiv preprint arXiv:2010.11683, 2020 - arxiv.org
Simple yet effective data augmentation techniques have been proposed for sentence-level
and sentence-pair natural language processing tasks. Inspired by these efforts, we design …

Sequence-level mixed sample data augmentation

D Guo, Y Kim, AM Rush - arXiv preprint arXiv:2011.09039, 2020 - arxiv.org
Despite their empirical success, neural networks still have difficulty capturing compositional
aspects of natural language. This work proposes a simple data augmentation approach to …

Gradient imitation reinforcement learning for low resource relation extraction

X Hu, C Zhang, Y Yang, X Li, L Lin, L Wen… - arXiv preprint arXiv …, 2021 - arxiv.org
Low-resource Relation Extraction (LRE) aims to extract relation facts from limited labeled
corpora when human annotation is scarce. Existing works either utilize self-training scheme …

Learning to generalize to more: Continuous semantic augmentation for neural machine translation

X Wei, H Yu, Y Hu, R Weng, W Luo, J Xie… - arXiv preprint arXiv …, 2022 - arxiv.org
The principal task in supervised neural machine translation (NMT) is to learn to generate
target sentences conditioned on the source inputs from a set of parallel sentence pairs, and …