DistiLLM: Towards Streamlined Distillation for Large Language Models

J Ko, S Kim, T Chen, SY Yun - arXiv preprint arXiv:2402.03898, 2024 - arxiv.org
Knowledge distillation (KD) is widely used for compressing a teacher model into a smaller
student model, reducing inference cost and memory footprint while preserving model …
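As context for this entry, the sketch below shows the standard knowledge-distillation objective (a KL-divergence term on temperature-softened teacher and student distributions), not DistiLLM's specific method; the model shapes and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) on temperature-softened distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean matches the mathematical definition of KL divergence;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Example: distill vocabulary-sized distributions for a batch of tokens
# (hypothetical shapes: batch of 8, vocabulary of 32k).
student_logits = torch.randn(8, 32000)
teacher_logits = torch.randn(8, 32000)
loss = kd_loss(student_logits, teacher_logits)
```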

Improving open-ended text generation via adaptive decoding

W Zhu, H Hao, Z He, Y Ai, R Wang - arXiv preprint arXiv:2402.18223, 2024 - arxiv.org
Current language models decode text token by token according to a probability distribution,
and determining the appropriate candidates for the next token is crucial to ensure …
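For context, the following is a generic illustration of token-by-token decoding with a truncated candidate set (nucleus/top-p sampling), not the paper's adaptive decoding algorithm; the threshold and vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits: torch.Tensor, top_p: float = 0.9) -> int:
    """Keep the smallest set of tokens whose cumulative probability
    exceeds top_p, then sample the next token from that set."""
    probs = F.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Keep tokens until the cumulative mass passes the threshold,
    # always retaining at least the single most probable token.
    keep = cumulative - sorted_probs < top_p
    keep[0] = True
    candidates = sorted_idx[keep]
    candidate_probs = sorted_probs[keep] / sorted_probs[keep].sum()
    choice = torch.multinomial(candidate_probs, num_samples=1)
    return int(candidates[choice])

# Example: pick the next token from a hypothetical vocabulary of 32k tokens.
next_token_id = sample_next_token(torch.randn(32000))
```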

Reembedding and Reweighting are Needed for Tail Item Sequential Recommendation

Z Li, Y Chen, T Zhang, X Wang - THE WEB CONFERENCE 2025, 2025 - openreview.net
Large vision models (LVMs) and large language models (LLMs) are becoming cutting-edge
for sequential recommendation, given their success in broad applications. Despite their …

Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs

Y Lu, A Sinha, P Varakantham - arXiv preprint arXiv:2412.06843, 2024 - arxiv.org
Large Language Models (LLMs) generating unsafe responses to toxic prompts is a
significant issue in their applications. While various efforts aim to address this safety …