Knowledge distillation with the reused teacher classifier

D Chen, JP Mei, H Zhang, C Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Knowledge distillation aims to compress a powerful yet cumbersome teacher model
into a lightweight student model without much sacrifice of performance. For this purpose …
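
For orientation, a minimal PyTorch sketch of the classic Hinton-style distillation objective that such methods build on; this is a generic illustration, not this paper's reused-classifier approach, and the temperature T and mixing weight alpha are assumed hyperparameters.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard soft-target distillation: KL between temperature-softened
    teacher and student distributions, mixed with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Scaling the soft term by T^2 keeps its gradient magnitude comparable to the hard-label cross-entropy when the temperature is raised.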

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Distilling object detectors via decoupled features

J Guo, K Han, Y Wang, H Wu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Knowledge distillation is a widely used paradigm for inheriting information from a
complicated teacher network to a compact student network and maintaining the strong …
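
For context, a generic feature-imitation loss in the spirit of FitNets, with an assumed 1x1 convolutional adapter to match channel dimensions; it is only a sketch of feature-level distillation and does not implement this paper's decoupling of features.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeatureImitation(nn.Module):
    """Generic feature-level distillation: a 1x1 convolution maps student
    feature maps to the teacher's channel dimension, then an L2 penalty
    pulls the two maps together."""

    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.adapt = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Detach the teacher features so gradients only update the student.
        return F.mse_loss(self.adapt(student_feat), teacher_feat.detach())
```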

Automated knowledge distillation via Monte Carlo tree search

L Li, P Dong, Z Wei, Y Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In this paper, we present Auto-KD, the first automated search framework for optimal
knowledge distillation design. Traditional distillation techniques typically require handcrafted …

KD-Zero: Evolving knowledge distiller for any teacher-student pairs

L Li, P Dong, A Li, Z Wei… - Advances in Neural …, 2023 - proceedings.neurips.cc
Knowledge distillation (KD) has emerged as an effective model-compression technique
that can enhance lightweight models. Conventional KD methods propose various …

Shadow knowledge distillation: Bridging offline and online knowledge transfer

L Li, Z Jin - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
Knowledge distillation can be generally divided into offline and online categories
according to whether the teacher model is pre-trained and persistent during the distillation …
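
To make the offline/online distinction concrete, a hedged sketch of one training step in each regime; the kd_loss callable (e.g. the one sketched above) and the joint optimizer in the online case are assumptions for illustration, not the Shadow Knowledge Distillation method itself.

```python
import torch

def offline_kd_step(teacher, student, student_opt, x, y, kd_loss):
    # Offline KD: the teacher is pre-trained and frozen; only the
    # student receives gradient updates.
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = kd_loss(s_logits, t_logits, y)
    student_opt.zero_grad()
    loss.backward()
    student_opt.step()
    return loss.item()

def online_kd_step(teacher, student, joint_opt, x, y, kd_loss):
    # Online KD (mutual-learning style): both networks are trained
    # together; joint_opt is assumed to hold the parameters of both,
    # and each model distils from the other's current predictions.
    t_logits = teacher(x)
    s_logits = student(x)
    loss = kd_loss(s_logits, t_logits.detach(), y) + kd_loss(t_logits, s_logits.detach(), y)
    joint_opt.zero_grad()
    loss.backward()
    joint_opt.step()
    return loss.item()
```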

ReSSL: Relational self-supervised learning with weak augmentation

M Zheng, S You, F Wang, C Qian… - Advances in …, 2021 - proceedings.neurips.cc
Self-supervised learning (SSL), including mainstream contrastive learning, has achieved
great success in learning visual representations without data annotations. However, most of …

Student customized knowledge distillation: Bridging the gap between student and teacher

Y Zhu, Y Wang - Proceedings of the IEEE/CVF International …, 2021 - openaccess.thecvf.com
Knowledge distillation (KD) transfers the dark knowledge from cumbersome (teacher)
networks to lightweight (student) networks and expects the student to achieve …

Confidence-aware multi-teacher knowledge distillation

H Zhang, D Chen, C Wang - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Knowledge distillation was initially introduced to utilize additional supervision from a single
teacher model for student model training. To boost student performance, some recent …
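
As a generic illustration of multi-teacher supervision rather than this paper's confidence-aware weighting, a sketch that blends several teachers' temperature-softened predictions into one target; the per-teacher weights stand in for whatever confidence scores a method might compute.

```python
import torch
import torch.nn.functional as F

def multi_teacher_soft_target(teacher_logits_list, weights=None, T=4.0):
    # Blend the teachers' temperature-softened probabilities into one target.
    # `weights` is a 1-D tensor with one scalar per teacher; uniform
    # averaging is used when it is omitted.
    n = len(teacher_logits_list)
    if weights is None:
        weights = torch.full((n,), 1.0 / n)
    probs = torch.stack([F.softmax(t / T, dim=1) for t in teacher_logits_list])
    return (weights.view(-1, 1, 1) * probs).sum(dim=0)

def multi_teacher_kd_loss(student_logits, teacher_logits_list, weights=None, T=4.0):
    target = multi_teacher_soft_target(teacher_logits_list, weights, T)
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1), target, reduction="batchmean"
    ) * (T * T)
```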

LightTS: Lightweight time series classification with adaptive ensemble distillation

D Campos, M Zhang, B Yang, T Kieu, C Guo… - Proceedings of the …, 2023 - dl.acm.org
Due to the sweeping digitalization of processes, increasingly vast amounts of time series
data are being produced. Accurate classification of such time series facilitates decision …