DistiLLM: Towards streamlined distillation for large language models
Knowledge distillation (KD) is widely used for compressing a teacher model to a smaller
student model, reducing its inference cost and memory footprint while preserving model …
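The snippet above describes the general knowledge distillation setup (compressing a teacher into a smaller student). A minimal sketch of a generic temperature-scaled forward-KL distillation loss is shown below; this illustrates the standard KD objective only and is not the specific DistiLLM objective, and the function and argument names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 2.0) -> torch.Tensor:
    """Generic forward-KL knowledge distillation loss between teacher and
    student token distributions (illustrative sketch, not the DistiLLM method)."""
    t = temperature
    # Student produces log-probabilities, teacher produces probabilities,
    # both softened by the temperature.
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # KL(teacher || student), scaled by t^2 as is conventional in distillation.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * (t * t)
```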
Improving open-ended text generation via adaptive decoding
Current language models decode text token by token according to a probability distribution,
and determining the appropriate candidates for the next token is crucial to ensure …
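The snippet above refers to choosing the candidate set for the next token at each decoding step. As a baseline illustration of such candidate selection, the sketch below shows nucleus (top-p) truncation with a fixed threshold; the paper's adaptive decoding adjusts the candidate set dynamically, which this sketch does not reproduce, and the function name and parameter are assumptions for illustration.

```python
import torch

def top_p_candidates(logits: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    """Return candidate token ids by nucleus (top-p) truncation.
    Illustrative fixed-threshold baseline, not the adaptive decoding method."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    # Keep the smallest prefix of tokens whose cumulative mass reaches p.
    cutoff = int(torch.searchsorted(cumulative, torch.tensor(p)).item()) + 1
    return sorted_idx[:cutoff]
```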
Reembedding and Reweighting are Needed for Tail Item Sequential Recommendation
Z Li, Y Chen, T Zhang, X Wang - THE WEB CONFERENCE 2025, 2025 - openreview.net
Large vision models (LVMs) and large language models (LLMs) are becoming cutting-edge
for sequential recommendation, given their success in broad applications. Despite their …
Semantic loss guided data efficient supervised fine tuning for Safe Responses in LLMs
Large Language Models (LLMs) generating unsafe responses to toxic prompts is a
significant issue in their applications. While various efforts aim to address this safety …