A survey of deep active learning

P Ren, Y Xiao, X Chang, PY Huang, Z Li… - ACM Computing …, 2021 - dl.acm.org
Active learning (AL) attempts to maximize a model's performance gain while annotating the
fewest samples possible. Deep learning (DL) is greedy for data and requires a large amount …

A survey on active deep learning: from model driven to data driven

P Liu, L Wang, R Ranjan, G He, L Zhao - ACM Computing Surveys …, 2022 - dl.acm.org
Which samples should be labelled in a large dataset is one of the most important problems
for training deep learning models. So far, a variety of active sample selection strategies related …

Fine-tuning language models from human preferences

DM Ziegler, N Stiennon, J Wu, TB Brown… - arXiv preprint arXiv …, 2019 - arxiv.org
Reward learning enables the application of reinforcement learning (RL) to tasks where
reward is defined by human judgment, building a model of reward by asking humans …

Active learning by acquiring contrastive examples

K Margatina, G Vernikos, L Barrault… - arXiv preprint arXiv …, 2021 - arxiv.org
Common acquisition functions for active learning use either uncertainty or diversity
sampling, aiming to select difficult and diverse data points from the pool of unlabeled data …
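The entry above contrasts uncertainty and diversity acquisition; a minimal sketch of the uncertainty half (entropy-based pool ranking, with a hypothetical toy pool of predicted class probabilities, not the paper's contrastive method):

```python
import math

def entropy(probs):
    # Shannon entropy of a predictive distribution; higher = more uncertain
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool_probs, k):
    # Rank unlabeled pool examples by predictive entropy and
    # return the indices of the k most uncertain ones.
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Toy pool: per-example class-probability vectors from some model
pool = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
print(select_most_uncertain(pool, 2))  # indices of the two most uncertain examples
```

Diversity-based acquisition would instead score examples by distance in representation space; methods like the contrastive approach above combine both signals.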

Active learning for BERT: an empirical study

L Ein-Dor, A Halfon, A Gera, E Shnarch… - Proceedings of the …, 2020 - aclanthology.org
Real world scenarios present a challenge for text classification, since labels are usually
expensive and the data is often characterized by class imbalance. Active Learning (AL) is a …

Grad-Match: Gradient matching based data subset selection for efficient deep model training

K Killamsetty, S Durga… - International …, 2021 - proceedings.mlr.press
The great success of modern machine learning models on large datasets is contingent on
extensive computational resources with high financial and environmental costs. One way to …
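Gradient matching picks a training subset whose summed gradient approximates the full-data gradient. A toy greedy sketch of that idea (simplified residual-matching selection over precomputed per-example gradients; the paper's actual algorithm uses orthogonal matching pursuit with learned weights):

```python
import numpy as np

def greedy_grad_match(per_example_grads, k):
    # per_example_grads: (n, d) array of per-example loss gradients.
    # Greedily pick k examples whose summed gradient best approximates
    # the full-data gradient sum.
    target = per_example_grads.sum(axis=0)
    chosen, current = [], np.zeros_like(target)
    for _ in range(k):
        residual = target - current
        # Score each example by alignment with the remaining residual
        scores = per_example_grads @ residual
        scores[chosen] = -np.inf  # never re-pick an example
        best = int(np.argmax(scores))
        chosen.append(best)
        current = current + per_example_grads[best]
    return chosen
```

Training on the selected subset (optionally reweighted) then approximates full-dataset training at a fraction of the compute.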

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B van Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Batch active learning at scale

G Citovsky, G DeSalvo, C Gentile… - Advances in …, 2021 - proceedings.neurips.cc
The ability to train complex and highly effective models often requires an abundance of
training data, which can easily become a bottleneck in cost, time, and computational …

Active learning by feature mixing

A Parvaneh, E Abbasnejad, D Teney… - Proceedings of the …, 2022 - openaccess.thecvf.com
The promise of active learning (AL) is to reduce labelling costs by selecting the most
valuable examples to annotate from a pool of unlabelled data. Identifying these examples is …

Glister: Generalization based data subset selection for efficient and robust learning

K Killamsetty, D Sivasubramanian… - Proceedings of the …, 2021 - ojs.aaai.org
Large scale machine learning and deep models are extremely data-hungry. Unfortunately,
obtaining large amounts of labeled data is expensive, and training state-of-the-art models …