A survey of deep active learning
Active learning (AL) attempts to maximize a model's performance gain while annotating the
fewest samples possible. Deep learning (DL) is greedy for data and requires a large amount …
A survey on active deep learning: from model driven to data driven
Deciding which samples in a large dataset should be labelled is one of the most important problems
in training deep learning models. So far, a variety of active sample selection strategies related …
Fine-tuning language models from human preferences
Reward learning enables the application of reinforcement learning (RL) to tasks where
reward is defined by human judgment, building a model of reward by asking humans …
Active learning by acquiring contrastive examples
Common acquisition functions for active learning use either uncertainty or diversity
sampling, aiming to select difficult and diverse data points from the pool of unlabeled data …
Active learning for BERT: an empirical study
Real world scenarios present a challenge for text classification, since labels are usually
expensive and the data is often characterized by class imbalance. Active Learning (AL) is a …
Grad-match: Gradient matching based data subset selection for efficient deep model training
The great success of modern machine learning models on large datasets is contingent on
extensive computational resources with high financial and environmental costs. One way to …
Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Batch active learning at scale
The ability to train complex and highly effective models often requires an abundance of
training data, which can easily become a bottleneck in cost, time, and computational …
Active learning by feature mixing
The promise of active learning (AL) is to reduce labelling costs by selecting the most
valuable examples to annotate from a pool of unlabelled data. Identifying these examples is …
Glister: Generalization based data subset selection for efficient and robust learning
Large scale machine learning and deep models are extremely data-hungry. Unfortunately,
obtaining large amounts of labeled data is expensive, and training state-of-the-art models …