From knowledge distillation to self-knowledge distillation: A unified approach with normalized loss and customized soft labels
Abstract: Knowledge Distillation (KD) uses the teacher's prediction logits as soft labels to
guide the student, while self-KD does not need a real teacher to provide the soft labels. This …
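For readers unfamiliar with the soft-label formulation this entry refers to, the sketch below shows the standard Hinton-style KD loss (cross-entropy on hard labels plus a temperature-scaled KL term against the teacher's logits). It is a generic baseline, not the normalized loss or customized soft labels proposed in this paper; the function name, temperature, and mixing weight are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            targets: torch.Tensor,
            temperature: float = 4.0,
            alpha: float = 0.5) -> torch.Tensor:
    """Hinton-style KD: cross-entropy on hard labels plus KL divergence
    to the teacher's temperature-softened predictions (soft labels)."""
    # Hard-label term: ordinary cross-entropy with the ground truth.
    ce = F.cross_entropy(student_logits, targets)

    # Soft-label term: KL(teacher || student) on softened distributions.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    # T^2 keeps the soft-label gradients comparable across temperatures.
    return (1.0 - alpha) * ce + alpha * (temperature ** 2) * kl
```

In use, the teacher's logits are detached so no gradient flows into the teacher, e.g. `loss = kd_loss(student(x), teacher(x).detach(), y)`.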
C2kd: Bridging the modality gap for cross-modal knowledge distillation
Abstract: Existing Knowledge Distillation (KD) methods typically focus on transferring
knowledge from a large-capacity teacher to a low-capacity student model, achieving …
Tolerant self-distillation for image classification
Deep neural networks tend to suffer from overfitting when the training data are insufficient.
In this paper, we introduce two metrics from the intra-class distribution of correct …
Neighbor self-knowledge distillation
P Liang, W Zhang, J Wang, Y Guo - Information Sciences, 2024 - Elsevier
Abstract: Self-Knowledge Distillation (Self-KD), a technique that enables neural networks to
learn from themselves, often relies on auxiliary modules or networks to generate supervisory …
Task-specific parameter decoupling for class incremental learning
Class incremental learning (CIL) enables deep networks to progressively learn new tasks
while remembering previously learned knowledge. A popular design for CIL involves …
Aligned objective for soft-pseudo-label generation in supervised learning
Soft pseudo-labels, generated by the softmax predictions of the trained networks, offer a
probabilistic rather than binary form, and have been shown to improve the performance of …
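The contrast this snippet draws between probabilistic and binary targets can be made concrete with a short sketch of how soft pseudo-labels are typically produced from a trained network's softmax output, next to the one-hot alternative. This is the generic construction only, not the aligned objective proposed in the paper; the function name and temperature are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def make_pseudo_labels(model: torch.nn.Module,
                       inputs: torch.Tensor,
                       num_classes: int,
                       temperature: float = 1.0):
    """Return soft (probabilistic) and hard (one-hot) pseudo-labels."""
    logits = model(inputs)

    # Soft pseudo-labels: full probability vectors from the softmax output.
    soft = F.softmax(logits / temperature, dim=1)

    # Hard pseudo-labels: the binary one-hot alternative, for comparison.
    hard = F.one_hot(logits.argmax(dim=1), num_classes).float()
    return soft, hard
```

Training against the soft targets with a full-distribution cross-entropy, e.g. `-(soft * F.log_softmax(student_logits, dim=1)).sum(dim=1).mean()`, preserves the inter-class similarity information that one-hot targets discard.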
Self-Distillation Learning Based on Temporal-Spatial Consistency for Spiking Neural Networks
Spiking neural networks (SNNs) have attracted considerable attention for their event-driven,
low-power characteristics and high biological interpretability. Inspired by knowledge …
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation
J Lv, H Yang, P Li - Advances in Neural Information …, 2025 - proceedings.neurips.cc
Since the pioneering work of Hinton et al., knowledge distillation based on Kullback-Leibler
Divergence (KL-Div) has been predominant, and recently its variants have achieved …
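For intuition only, the sketch below contrasts the predominant KL-divergence term with a closed-form 1-D Wasserstein-1 distance between two discrete distributions. It assumes the classes lie on an ordered 1-D support with unit spacing, which is a deliberate simplification for illustration and not the ground metric or formulation used in the cited paper.

```python
import torch
import torch.nn.functional as F

def kl_term(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions."""
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean")

def wasserstein1_1d(student_logits, teacher_logits, T=4.0):
    """W1 between two categorical distributions, assuming the classes sit on
    an ordered 1-D grid with unit spacing: W1 = sum |CDF_s - CDF_t|."""
    p_s = F.softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    cdf_s = torch.cumsum(p_s, dim=1)
    cdf_t = torch.cumsum(p_t, dim=1)
    return (cdf_s - cdf_t).abs().sum(dim=1).mean()
```

Unlike KL divergence, the Wasserstein formulation depends on a ground metric between classes, which is exactly what the 1-D grid assumption stands in for here.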
Self-knowledge distillation based on knowledge transfer from soft to hard examples
Y Tang, Y Chen, L Xie - Image and Vision Computing, 2023 - Elsevier
To fully exploit knowledge from a self-knowledge distillation network, in which a student model
is progressively trained to distill its own knowledge without a pre-trained teacher model, a …
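As a generic illustration of the teacher-free setting described above (not the soft-to-hard example transfer proposed in this paper), one common self-KD pattern distills from the model's own softened predictions cached in a previous epoch. The function name, mixing weight, and cache handling below are assumptions; in practice the first epoch is usually trained with cross-entropy alone before any cached predictions exist.

```python
import torch
import torch.nn.functional as F

def self_kd_step(model, inputs, targets, past_probs, alpha=0.5, T=2.0):
    """One self-KD loss: cross-entropy with ground truth plus KL divergence
    to the model's own softened predictions from an earlier epoch
    (`past_probs`), so no separate pre-trained teacher is needed."""
    logits = model(inputs)
    ce = F.cross_entropy(logits, targets)

    # Distill from the model's own earlier predictions instead of a teacher.
    log_p = F.log_softmax(logits / T, dim=1)
    kl = F.kl_div(log_p, past_probs, reduction="batchmean")

    # Cache the current softened predictions as next epoch's soft targets.
    new_past = F.softmax(logits.detach() / T, dim=1)
    return (1 - alpha) * ce + alpha * (T ** 2) * kl, new_past
```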
AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation
We present a novel adversarial penalized self-knowledge distillation method, named
adversarial learning and implicit regularization for self-knowledge distillation (AI-KD), which …