A survey on data augmentation for text classification

M Bayer, MA Kaufhold, C Reuter - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …

Image data augmentation approaches: A comprehensive survey and future directions

T Kumar, R Brennan, A Mileo, M Bendechache - IEEE Access, 2024 - ieeexplore.ieee.org
Deep learning algorithms have exhibited impressive performance across various computer
vision tasks; however, the challenge of overfitting persists, especially when dealing with …

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

An empirical survey of data augmentation for limited data learning in NLP

J Chen, D Tam, C Raffel, M Bansal… - Transactions of the …, 2023 - direct.mit.edu
NLP has achieved great progress in the past decade through the use of neural models and
large labeled datasets. The dependence on abundant data prevents NLP models from being …

Data augmentation techniques in natural language processing

LFAO Pellicer, TM Ferreira, AHR Costa - Applied Soft Computing, 2023 - Elsevier
Data Augmentation (DA) methods–a family of techniques designed for synthetic generation
of training data–have shown remarkable results in various Deep Learning and Machine …

CyBERT: Contextualized embeddings for the cybersecurity domain

P Ranade, A Piplai, A Joshi… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
We present CyBERT, a domain-specific Bidirectional Encoder Representations from
Transformers (BERT) model, fine-tuned with a large corpus of textual cybersecurity data …

Does GPT-3 generate empathetic dialogues? A novel in-context example selection method and automatic evaluation metric for empathetic dialogue generation

YJ Lee, CG Lim, HJ Choi - … of the 29th International Conference on …, 2022 - aclanthology.org
Since empathy plays a crucial role in increasing social bonding between people, many
studies have designed their own dialogue agents to be empathetic using the well …

Exploring new frontiers in agricultural NLP: Investigating the potential of large language models for food applications

S Rezayi, Z Liu, Z Wu, C Dhakal, B Ge… - … Transactions on Big …, 2024 - ieeexplore.ieee.org
This paper explores new frontiers in agricultural natural language processing (NLP) by
investigating the effectiveness of food-related text corpora for pretraining transformer-based …

Generating fake cyber threat intelligence using transformer-based models

P Ranade, A Piplai, S Mittal, A Joshi… - 2021 International Joint …, 2021 - ieeexplore.ieee.org
Cyber-defense systems are being developed to automatically ingest Cyber Threat
Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge …

The parrot dilemma: Human-labeled vs. LLM-augmented data in classification tasks

AG Møller, JA Dalsgaard, A Pera, LM Aiello - arXiv preprint arXiv …, 2023 - arxiv.org
In the realm of Computational Social Science (CSS), practitioners often navigate complex,
low-resource domains and face the costly and time-intensive challenges of acquiring and …