Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Distillation from heterogeneous models for top-k recommendation
Recent recommender systems have shown remarkable performance by using an ensemble
of heterogeneous models. However, it is exceedingly costly because it requires resources …
A preliminary study of the intrinsic relationship between complexity and alignment
Training large language models (LLMs) with open-domain instruction data has yielded
remarkable success in aligning to end tasks and user preferences. Extensive research has …
Unbiased, Effective, and Efficient Distillation from Heterogeneous Models for Recommender Systems
In recent years, recommender systems have achieved remarkable performance by using
ensembles of heterogeneous models. However, this approach is costly due to the resources …
Multi-level curriculum learning for multi-turn dialogue generation
Since deep learning is the dominant paradigm in the multi-turn dialogue generation task,
large-scale training data is the key factor affecting the model performance. To make full use …
Knowledge in attention assistant for improving generalization in deep teacher–student models
Research on knowledge distillation has become active in deep neural networks. Knowledge
distillation involves training a low-capacity model from a high-capacity model. However …
Let's Be Self-generated via Step by Step: A Curriculum Learning Approach to Automated Reasoning with Large Language Models
While Chain of Thought (CoT) prompting approaches have significantly consolidated the
reasoning capabilities of large language models (LLMs), they still face limitations that …
MedProm: Bridging Dialogue Gaps in Healthcare with Knowledge-Enhanced Generative Models
In medical dialogue systems, recent advancements underscore the critical role of
incorporating relevant medical knowledge to enhance performance. However, existing …
Paced-curriculum distillation with prediction and label uncertainty for image segmentation
Purpose: In curriculum learning, the idea is to train on easier samples first and gradually
increase the difficulty, while in self-paced learning, a pacing function defines the speed to …