On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Within the evolving landscape of deep learning, the dilemma of data quantity and quality has
been a long-standing problem. The recent advent of Large Language Models (LLMs) offers …
When to Stop? Towards Efficient Code Generation in LLMs with Excess Token Prevention
Code generation aims to automatically generate code snippets that meet given natural
language requirements and plays an important role in software development. Although …
Supervised Knowledge Makes Large Language Models Better In-Context Learners
Large Language Models (LLMs) exhibit emerging in-context learning abilities through
prompt engineering. The recent progress in large-scale generative models has further …
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance
B Yuan, Y Chen, Y Zhang, W Jiang - Proceedings of the 62nd …, 2024 - aclanthology.org
Learning from noisy labels (LNL) is a challenge that arises in many real-world scenarios
where collected training data can contain incorrect or corrupted labels. Most existing …
Actively Learn from LLMs with Uncertainty Propagation for Generalized Category Discovery
Generalized category discovery faces a key issue: the lack of supervision for new and
unseen data categories. Traditional methods typically combine supervised pretraining with …
Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay
R Liu, J Zhang, Y Song, Y Zhang, B Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Continual Semantic Parsing (CSP) aims to train parsers to convert natural language
questions into SQL across tasks with limited annotated examples, adapting to the real-world …
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models
The Mutual Reinforcement Effect (MRE) represents a promising avenue in information
extraction and multitasking research. Nevertheless, its applicability has been constrained …
Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation
In the context of text classification, the financial burden of annotation exercises for creating
training data is a critical issue. Active learning techniques, particularly those rooted in …
Self-Training for Sample-Efficient Active Learning for Text Classification with Pre-Trained Language Models
Active learning is an iterative labeling process that is used to obtain a small labeled subset,
despite the absence of labeled data, thereby enabling the training of a model for supervised tasks …
Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation
Large language models (LLMs) have significantly advanced various natural language
processing tasks, but deploying them remains computationally expensive. Knowledge …