A survey on data augmentation for text classification

M Bayer, MA Kaufhold, C Reuter - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …

Pre-trained language models for text generation: A survey

J Li, T Tang, WX Zhao, JY Nie, JR Wen - ACM Computing Surveys, 2024 - dl.acm.org
Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

A survey on recent approaches for natural language processing in low-resource scenarios

MA Hedderich, L Lange, H Adel, J Strötgen… - arXiv preprint arXiv …, 2020 - arxiv.org
Deep neural networks and huge language models are becoming omnipresent in natural
language applications. As they are known for requiring large amounts of training data, there …

AEDA: an easier data augmentation technique for text classification

A Karimi, L Rossi, A Prati - arXiv preprint arXiv:2108.13230, 2021 - arxiv.org
This paper proposes AEDA (An Easier Data Augmentation) technique to help improve the
performance on text classification tasks. AEDA includes only random insertion of …

PromDA: Prompt-based data augmentation for low-resource NLU tasks

Y Wang, C Xu, Q Sun, H Hu, C Tao, X Geng… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper focuses on the Data Augmentation for low-resource Natural Language
Understanding (NLU) tasks. We propose Prompt-based Data Augmentation model …

MELM: Data augmentation with masked entity language modeling for low-resource NER

R Zhou, X Li, R He, L Bing, E Cambria, L Si… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation is an effective solution to data scarcity in low-resource scenarios.
However, when applied to token-level tasks such as NER, data augmentation methods often …

MulDA: A multilingual data augmentation framework for low-resource cross-lingual NER

L Liu, B Ding, L Bing, S Joty, L Si… - Proceedings of the 59th …, 2021 - aclanthology.org
Named Entity Recognition (NER) for low-resource languages is a both practical and
challenging research problem. This paper addresses zero-shot transfer for cross-lingual …

A survey on Arabic named entity recognition: Past, recent advances, and future trends

X Qu, Y Gu, Q Xia, Z Li, Z Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
As more and more Arabic texts emerged on the Internet, extracting important information
from these Arabic texts is especially useful. As a fundamental technology, Named entity …

Language-guided music recommendation for video via prompt analogies

D McKee, J Salamon, J Sivic… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a method to recommend music for an input video while allowing a user to guide
music selection with free-form natural language. A key challenge of this problem setting is …