Self-exploring language models: Active preference elicitation for online alignment

S Zhang, D Yu, H Sharma, H Zhong, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference optimization, particularly through Reinforcement Learning from Human
Feedback (RLHF), has achieved significant success in aligning Large Language Models …

Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast amount of open instruction datasets, naively training an LLM on …

KnowAgent: Knowledge-augmented planning for LLM-based agents

Y Zhu, S Qiao, Y Ou, S Deng, N Zhang, S Lyu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated great potential in complex reasoning
tasks, yet they fall short when tackling more sophisticated challenges, especially when …

Causal prompting: Debiasing large language model prompting based on front-door adjustment

C Zhang, L Zhang, J Wu, Y He, D Zhou - arXiv preprint arXiv:2403.02738, 2024 - arxiv.org
Despite the notable advancements of existing prompting methods, such as In-Context
Learning and Chain-of-Thought for Large Language Models (LLMs), they still face …

SelectIT: Selective instruction tuning for large language models via uncertainty-aware self-reflection

L Liu, X Liu, DF Wong, D Li, Z Wang, B Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning (IT) is crucial to tailoring large language models (LLMs) towards human-
centric interactions. Recent advancements have shown that the careful selection of a small …

LLMs-as-Instructors: Learning from errors toward automating model improvement

J Ying, M Lin, Y Cao, W Tang, B Wang, Q Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces the innovative "LLMs-as-Instructors" framework, which leverages
advanced Large Language Models (LLMs) to autonomously enhance the training of smaller …

Demystifying data management for large language models

X Miao, Z Jia, B Cui - Companion of the 2024 International Conference …, 2024 - dl.acm.org
Navigating the intricacies of data management in the era of Large Language Models (LLMs)
presents both challenges and opportunities for database and data management …

Instruction Embedding: Latent Representations of Instructions Towards Task Identification

Y Li, J Shi, S Feng, P Yuan, X Wang… - Advances in …, 2025 - proceedings.neurips.cc
Instruction data is crucial for improving the capability of Large Language Models (LLMs) to
align with human-level performance. Recent research, such as LIMA, demonstrates that alignment is …

SHED: Shapley-based automated dataset refinement for instruction fine-tuning

Y He, Z Wang, Z Shen, G Sun, Y Dai, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Pre-trained Large Language Models (LLMs) can be adapted for many downstream
tasks and tailored to align with human preferences through fine-tuning. Recent studies have …

Quality-weighted Vendi scores and their application to diverse experimental design

Q Nguyen, AB Dieng - arXiv preprint arXiv:2405.02449, 2024 - arxiv.org
Experimental design techniques such as active search and Bayesian optimization are
widely used in the natural sciences for data collection and discovery. However, existing …