Large language model as attributed training data generator: A tale of diversity and bias

Y Yu, Y Zhuang, J Zhang, Y Meng… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have been recently leveraged as training data generators
for various natural language processing (NLP) tasks. While previous research has explored …

Generating training data with language models: Towards zero-shot language understanding

Y Meng, J Huang, Y Zhang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Pretrained language models (PLMs) have demonstrated remarkable performance in various
natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their …

Text classification using label names only: A language model self-training approach

Y Meng, Y Zhang, J Huang, C Xiong, H Ji… - arXiv preprint arXiv …, 2020 - arxiv.org
Current text classification methods typically require a good number of human-labeled
documents as training data, which can be costly and difficult to obtain in real applications …

Tuning language models as training data generators for augmentation-enhanced few-shot learning

Y Meng, M Michalski, J Huang… - International …, 2023 - proceedings.mlr.press
Recent studies have revealed the intriguing few-shot learning ability of pretrained language
models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of …

Interactive continual learning: Fast and slow thinking

B Qi, X Chen, J Gao, D Li, J Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Advanced life forms sustained by the synergistic interaction of neural cognitive mechanisms
continually acquire and transfer knowledge throughout their lifespan. In contrast …

Scimine: An efficient systematic prioritization model based on richer semantic information

F Guo, Y Luo, L Yang, Y Zhang - … of the 46th International ACM SIGIR …, 2023 - dl.acm.org
Systematic review is a crucial method that has been widely used by scholars from different
research domains. However, screening for relevant scientific literature from paper …

Weakly supervised temporal sentence grounding with uncertainty-guided self-training

Y Huang, L Yang, Y Sato - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
The task of weakly supervised temporal sentence grounding aims at finding the
corresponding temporal moments of a language description in the video, given video …

Topic discovery via latent space clustering of pretrained language model representations

Y Meng, Y Zhang, J Huang, Y Zhang… - Proceedings of the ACM …, 2022 - dl.acm.org
Topic models have been the prominent tools for automatic topic discovery from text corpora.
Despite their effectiveness, topic models suffer from several limitations including the inability …

Distantly-supervised named entity recognition with noise-robust learning and language model augmented self-training

Y Meng, Y Zhang, J Huang, X Wang, Y Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
We study the problem of training named entity recognition (NER) models using only distantly-
labeled data, which can be automatically obtained by matching entity mentions in the raw …

Contextualized weak supervision for text classification

D Mekala, J Shang - Proceedings of the 58th Annual Meeting of …, 2020 - aclanthology.org
Weakly supervised text classification based on a few user-provided seed words has recently
attracted much attention from researchers. Existing methods mainly generate pseudo-labels …