Knowledgeable prompt-tuning: Incorporating knowledge into prompt verbalizer for text classification
Tuning pre-trained language models (PLMs) with task-specific prompts has been a
promising approach for text classification. Particularly, previous studies suggest that prompt …
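As an illustration of the prompt-verbalizer idea this entry refers to, here is a minimal sketch that scores classes by aggregating a masked LM's [MASK] logits over several label words per class. The template, class names, and label-word sets are illustrative assumptions, not the paper's knowledge-expanded verbalizer.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical verbalizer: each class maps to multiple label words
# (a knowledge-enriched verbalizer would expand these sets further).
verbalizer = {
    "sports": ["sports", "football", "athletics"],
    "politics": ["politics", "government", "election"],
}

def classify(text: str) -> str:
    # Illustrative template; the mask token is filled by the LM.
    prompt = f"{text} This topic is about [MASK]."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each class by averaging logits over its label words.
    scores = {}
    for label, words in verbalizer.items():
        ids = [tokenizer.convert_tokens_to_ids(w) for w in words]
        scores[label] = logits[ids].mean().item()
    return max(scores, key=scores.get)
```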
Self-training: A survey
Self-training methods have gained significant attention in recent years due to their
effectiveness in leveraging small labeled datasets and large unlabeled observations for …
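The core self-training loop the survey covers is simple to sketch: fit a classifier on the labeled set, pseudo-label confident unlabeled examples, and retrain. The confidence threshold, round count, and base classifier below are illustrative choices, not values from the survey.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    """Minimal self-training sketch: pseudo-label confident examples,
    fold them into the training set, and refit."""
    X, y = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        if len(pool) == 0:
            break
        probs = clf.predict_proba(pool)
        keep = probs.max(axis=1) >= threshold
        if not keep.any():
            break
        # Adopt the model's confident predictions as pseudo-labels.
        X = np.vstack([X, pool[keep]])
        y = np.concatenate([y, clf.classes_[probs[keep].argmax(axis=1)]])
        pool = pool[~keep]
    return clf
```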
COCO-LM: Correcting and contrasting text sequences for language model pretraining
We present a self-supervised learning framework, COCO-LM, that pretrains Language
Models by COrrecting and COntrasting corrupted text sequences. Following ELECTRA-style …
Text classification using embeddings: a survey
Text classification results can be hindered when just the bag-of-words model is used for
representing features, because it ignores word order and senses, which can vary with the …
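The word-order limitation the snippet mentions is easy to demonstrate: two sentences with opposite meanings can produce identical bag-of-words vectors. The example sentences below are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Opposite meanings, identical word counts: indistinguishable under BoW.
docs = ["the cat chased the dog", "the dog chased the cat"]
X = CountVectorizer().fit_transform(docs)
print((X[0] != X[1]).nnz == 0)  # True: the two vectors are identical
```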
Harnessing artificial intelligence to combat online hate: Exploring the challenges and opportunities of large language models in hate speech detection
Large language models (LLMs) excel in many diverse applications beyond language
generation, e.g., translation, summarization, and sentiment analysis. One intriguing …
Fine-tuning pre-trained language model with weak supervision: A contrastive-regularized self-training approach
Fine-tuned pre-trained language models (LMs) have achieved enormous success in many
natural language processing (NLP) tasks, but they still require excessive labeled data in the …
Topic discovery via latent space clustering of pretrained language model representations
Topic models have been the prominent tools for automatic topic discovery from text corpora.
Despite their effectiveness, topic models suffer from several limitations including the inability …
Decoupling knowledge from memorization: Retrieval-augmented prompt learning
Prompt learning approaches have made waves in natural language processing by inducing
better few-shot performance while they still follow a parametric-based learning paradigm; …
Distantly-supervised named entity recognition with noise-robust learning and language model augmented self-training
We study the problem of training named entity recognition (NER) models using only distantly-
labeled data, which can be automatically obtained by matching entity mentions in the raw …
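The distant labeling step the snippet describes, matching entity mentions in raw text against an external dictionary, can be sketched as below. The gazetteer entries and BIO tagging are illustrative; matches are inherently noisy, which is what the paper's noise-robust learning addresses.

```python
# Hypothetical gazetteer mapping surface forms to entity types.
gazetteer = {"new york": "LOC", "barack obama": "PER"}

def distant_label(tokens):
    """Produce noisy BIO labels by dictionary matching over token spans."""
    labels = ["O"] * len(tokens)
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            span = " ".join(tokens[i:j]).lower()
            if span in gazetteer:
                labels[i] = "B-" + gazetteer[span]
                for k in range(i + 1, j):
                    labels[k] = "I-" + gazetteer[span]
    return labels

print(distant_label("Barack Obama visited New York".split()))
# ['B-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC']
```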
PRBoost: Prompt-based rule discovery and boosting for interactive weakly-supervised learning
Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity
on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set …