A holistic approach to undesired content detection in the real world

T Markov, C Zhang, S Agarwal, FE Nekoul… - Proceedings of the …, 2023 - ojs.aaai.org
We present a holistic approach to building a robust and useful natural language
classification system for real-world content moderation. The success of such a system relies …

A survey of active learning for natural language processing

Z Zhang, E Strubell, E Hovy - arXiv preprint arXiv:2210.10109, 2022 - arxiv.org
In this work, we provide a survey of active learning (AL) for its applications in natural
language processing (NLP). In addition to a fine-grained categorization of query strategies …

A Survey on Deep Active Learning: Recent Advances and New Frontiers

D Li, Z Wang, Y Chen, R Jiang, W Ding… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Active learning seeks to achieve strong performance with fewer training samples. It does this
by iteratively asking an oracle to label newly selected samples in a human-in-the-loop …

Cache & distil: Optimising API calls to large language models

G Ramírez, M Lindemann, A Birch, I Titov - arXiv preprint arXiv …, 2023 - arxiv.org
Large-scale deployment of generative AI tools often depends on costly API calls to a Large
Language Model (LLM) to fulfil user queries. To curtail the frequency of these calls, one can …

Active learning for natural language generation

Y Perlitz, A Gera, M Shmueli-Scheuer… - arXiv preprint arXiv …, 2023 - arxiv.org
The field of Natural Language Generation (NLG) suffers from a severe shortage of labeled
data due to the extremely expensive and time-consuming process involved in manual …

Phrase-level active learning for neural machine translation

J Hu, G Neubig - arXiv preprint arXiv:2106.11375, 2021 - arxiv.org
Neural machine translation (NMT) is sensitive to domain shift. In this paper, we address this
problem in an active learning setting where we can spend a given budget on translating in …

Turn-Level Active Learning for Dialogue State Tracking

Z Zhang, M Fang, F Ye, L Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Dialogue state tracking (DST) plays an important role in task-oriented dialogue systems.
However, collecting a large amount of turn-by-turn annotated dialogue data is costly and …

Bayesian active learning with pretrained language models

K Margatina, L Barrault, N Aletras - arXiv, 2021 - eprints.whiterose.ac.uk
Active Learning (AL) is a method to iteratively select data for annotation from a pool of
unlabeled data, aiming to achieve better model performance than random selection …

Active learning for neural machine translation

N Vashistha, K Singh, R Shakya - arXiv preprint arXiv:2301.00688, 2022 - arxiv.org
The machine translation mechanism translates texts automatically between different natural
languages, and Neural Machine Translation (NMT) has gained attention for its rational …

CHIA: CHoosing instances to annotate for machine translation

R Bhatnagar, A Ganesh, K Kann - Findings of the Association for …, 2022 - par.nsf.gov
Neural machine translation (MT) systems have been shown to perform poorly on low-
resource language pairs, for which large-scale parallel data is unavailable. Making the data …