Self-exploring language models: Active preference elicitation for online alignment

S Zhang, D Yu, H Sharma, H Zhong, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference optimization, particularly through Reinforcement Learning from Human
Feedback (RLHF), has achieved significant success in aligning Large Language Models …

Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast amount of open instruction datasets, naively training an LLM on …

KnowAgent: Knowledge-augmented planning for LLM-based agents

Y Zhu, S Qiao, Y Ou, S Deng, N Zhang, S Lyu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated great potential in complex reasoning
tasks, yet they fall short when tackling more sophisticated challenges, especially when …

Causal prompting: Debiasing large language model prompting based on front-door adjustment

C Zhang, L Zhang, J Wu, Y He, D Zhou - arXiv preprint arXiv:2403.02738, 2024 - arxiv.org
Despite the notable advancements of existing prompting methods, such as In-Context
Learning and Chain-of-Thought for Large Language Models (LLMs), they still face …

SelectIT: Selective instruction tuning for large language models via uncertainty-aware self-reflection

L Liu, X Liu, DF Wong, D Li, Z Wang, B Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning (IT) is crucial to tailoring large language models (LLMs) towards human-
centric interactions. Recent advancements have shown that the careful selection of a small …

LLMs-as-Instructors: Learning from errors toward automating model improvement

J Ying, M Lin, Y Cao, W Tang, B Wang, Q Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces the innovative "LLMs-as-Instructors" framework, which leverages
advanced Large Language Models (LLMs) to autonomously enhance the training of smaller …

Demystifying data management for large language models

X Miao, Z Jia, B Cui - Companion of the 2024 International Conference …, 2024 - dl.acm.org
Navigating the intricacies of data management in the era of Large Language Models (LLMs)
presents both challenges and opportunities for database and data management …

Instruction Embedding: Latent Representations of Instructions Towards Task Identification

Y Li, J Shi, S Feng, P Yuan, X Wang… - Advances in …, 2025 - proceedings.neurips.cc
Instruction data is crucial for improving the capability of Large Language Models (LLMs) to
align with human-level performance. Recent research, such as LIMA, demonstrates that alignment is …

SHED: Shapley-based automated dataset refinement for instruction fine-tuning

Y He, Z Wang, Z Shen, G Sun, Y Dai, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Pre-trained Large Language Models (LLMs) can be adapted for many downstream
tasks and tailored to align with human preferences through fine-tuning. Recent studies have …

Quality-weighted Vendi scores and their application to diverse experimental design

Q Nguyen, AB Dieng - arXiv preprint arXiv:2405.02449, 2024 - arxiv.org
Experimental design techniques such as active search and Bayesian optimization are
widely used in the natural sciences for data collection and discovery. However, existing …