Direct preference knowledge distillation for large language models
In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical
technique for transferring capabilities from teacher models to student models. However …
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
The success of Large Language Models (LLMs) has led researchers to explore Multimodal
Large Language Models (MLLMs) for unified visual and linguistic understanding. However …
Exploring and enhancing the transfer of distribution in knowledge distillation for autoregressive language models
Knowledge distillation (KD) is a technique that compresses large teacher models by training
smaller student models to mimic them. The success of KD in auto-regressive language …
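The snippet above describes the core white-box KD recipe: a smaller student trained to mimic the teacher's output distribution. As a point of reference, a minimal sketch of the standard token-level distillation loss in PyTorch is given below; the temperature value and tensor shapes are illustrative assumptions, not details taken from the paper.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    # Token-level forward-KL distillation: the student is trained to match
    # the teacher's softened next-token distribution at every position.
    # Inputs: (batch, seq_len, vocab_size) logit tensors.
    t = temperature
    vocab = student_logits.size(-1)
    log_q = F.log_softmax(student_logits / t, dim=-1).view(-1, vocab)  # student log-probs
    p = F.softmax(teacher_logits / t, dim=-1).view(-1, vocab)          # teacher probs
    # 'batchmean' averages the per-token KL over the flattened positions;
    # the t**2 factor keeps gradient magnitudes on the usual scale when t != 1.
    return F.kl_div(log_q, p, reduction="batchmean") * (t ** 2)

In practice this term is typically combined with the ordinary cross-entropy loss on ground-truth tokens.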
Dual-Space Knowledge Distillation for Large Language Models
Knowledge distillation (KD) is known as a promising solution to compress large language
models (LLMs) via transferring their knowledge to smaller models. During this process, white …
Llm-neo: Parameter efficient knowledge distillation for large language models
In this paper, we propose a novel LLM-Neo framework that efficiently transfers knowledge
from a large language model (LLM) teacher to a compact student. Initially, we revisit the …
Agent-DA: Enhancing low-resource event extraction with collaborative multi-agent data augmentation
X Tian, Y Guo, B Ge, X Yuan, H Zhang, Y Yang… - Knowledge-Based …, 2024 - Elsevier
Low-resource event extraction presents a significant challenge in real-world applications,
particularly in domains like pharmaceuticals, military and law, where data is frequently …
Self-Evolution Knowledge Distillation for LLM-based Machine Translation
Knowledge distillation (KD) has shown great promise in transferring knowledge from larger
teacher models to smaller student models. However, existing KD strategies for large …
RKLD: Reverse KL-Divergence-based Knowledge Distillation for Unlearning Personal Information in Large Language Models
B Wang, Y Zi, Y Sun, Y Zhao, B Qin - arXiv preprint arXiv:2406.01983, 2024 - arxiv.org
With the passage of the Right to Be Forgotten (RTBF) regulations and the scaling up of
language model training datasets, research on model unlearning in large language models …
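The title indicates a reverse-KL objective. For contrast with the forward-KL loss sketched earlier, a minimal reverse-KL distillation term, KL(student || target), might look as follows in PyTorch; how RKLD constructs its unlearning target distribution is not visible in the snippet and is not modeled here.

import torch.nn.functional as F

def reverse_kl_loss(student_logits, target_logits):
    # Reverse KL, KL(q_student || p_target): mode-seeking, so the student is
    # penalized mainly for placing probability where the target has little mass,
    # whereas forward KL (used in the earlier sketch) is mean-seeking.
    log_q = F.log_softmax(student_logits, dim=-1)   # student log-probs
    log_p = F.log_softmax(target_logits, dim=-1)    # target log-probs
    q = log_q.exp()
    # sum_v q(v) * (log q(v) - log p(v)), averaged over batch and positions
    return (q * (log_q - log_p)).sum(dim=-1).mean()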
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Causal language models have demonstrated remarkable capabilities, but their size poses
significant challenges for deployment in resource-constrained environments. Knowledge …
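The title suggests distilling against a target interpolated between the student and the teacher, with the interpolation shifting over training. A hedged sketch of that general idea is below, using a simple linear schedule as a stand-in; TAID's actual schedule and loss formulation may differ.

import torch.nn.functional as F

def interpolated_kd_loss(student_logits, teacher_logits, step, total_steps):
    # Distill toward a time-dependent mixture of the student's own distribution
    # and the teacher's: early targets stay close to the student (easier to fit),
    # later targets approach the teacher. The linear ramp is an illustrative
    # assumption, not the schedule from the paper.
    alpha = min(step / max(total_steps, 1), 1.0)                 # 0 -> 1 over training
    vocab = student_logits.size(-1)
    p_student = F.softmax(student_logits, dim=-1).detach()       # no gradient through the target
    p_teacher = F.softmax(teacher_logits, dim=-1)
    target = ((1.0 - alpha) * p_student + alpha * p_teacher).view(-1, vocab)
    log_q = F.log_softmax(student_logits, dim=-1).view(-1, vocab)
    return F.kl_div(log_q, target, reduction="batchmean")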
Knowledge Extraction from LLMs for Scalable Historical Data Annotation
F Celli, D Mingazov - Electronics, 2024 - mdpi.com
This paper introduces a novel approach to extract knowledge from large language models
and generate structured historical datasets. We investigate the feasibility and limitations of …