Model compression for deep neural networks: A survey

Z Li, H Li, L Meng - Computers, 2023 - mdpi.com
With the rapid development of deep learning, deep neural networks (DNNs) have
been widely applied to various computer vision tasks. However, in the pursuit of …

Distilling knowledge via knowledge review

P Chen, S Liu, H Zhao, J Jia - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Knowledge distillation transfers knowledge from the teacher network to the student
one, with the goal of greatly improving the performance of the student network. Previous …

Logit standardization in knowledge distillation

S Sun, W Ren, J Li, R Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Knowledge distillation involves transferring soft labels from a teacher to a student
using a shared temperature-based softmax function. However, the assumption of a shared …
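For context, a minimal sketch of the conventional shared-temperature distillation loss that this snippet refers to (PyTorch; the temperature value and toy logits are illustrative assumptions, and the paper's logit standardization itself is not shown):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Vanilla KD objective: KL divergence between teacher and student
    distributions softened with the same (shared) temperature T."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)

# Toy usage with random logits standing in for real model outputs.
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
loss = kd_loss(student_logits, teacher_logits)
```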

Knowledge distillation from a stronger teacher

T Huang, S You, F Wang, C Qian… - Advances in Neural …, 2022 - proceedings.neurips.cc
Unlike existing knowledge distillation methods that focus on baseline settings, where the
teacher models and training strategies are not as strong and competitive as state-of-the-art …

Curriculum temperature for knowledge distillation

Z Li, X Li, L Yang, B Zhao, R Song, L Luo, J Li… - Proceedings of the …, 2023 - ojs.aaai.org
Most existing distillation methods ignore the flexible role of the temperature in the loss
function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In …
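To make the temperature's role concrete, the toy sketch below contrasts a fixed, grid-searched value with a simple schedule that varies it over training; the linear ramp and its endpoints are purely illustrative and are not the curriculum proposed in the paper:

```python
def fixed_temperature():
    # Conventional setup: one constant chosen by grid search and never changed.
    return 4.0

def scheduled_temperature(epoch, total_epochs, t_start=1.0, t_end=6.0):
    """Toy linear temperature schedule (illustrative only): the temperature
    varies over training instead of being fixed as a hyper-parameter."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return t_start + frac * (t_end - t_start)

# The scheduled value would replace the constant T in a KD loss such as the
# kd_loss sketch above.
for epoch in (0, 30, 60, 90, 119):
    print(epoch, scheduled_temperature(epoch, 120))
```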

AM-RADIO: Agglomerative vision foundation model reduce all domains into one

M Ranzinger, G Heinrich, J Kautz… - Proceedings of the …, 2024 - openaccess.thecvf.com
A handful of visual foundation models (VFMs) have recently emerged as the backbones for
numerous downstream tasks. VFMs like CLIP, DINOv2, and SAM are trained with distinct …

Knowledge distillation with the reused teacher classifier

D Chen, JP Mei, H Zhang, C Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Knowledge distillation aims to compress a powerful yet cumbersome teacher model
into a lightweight student model without much sacrifice of performance. For this purpose …
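A minimal sketch of the idea suggested by the title, under my reading: the student keeps a lightweight backbone plus a projector into the teacher's feature space, and the teacher's frozen classifier head is reused on top; the module shapes and the linear projector below are illustrative assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class StudentWithReusedClassifier(nn.Module):
    """Student backbone + projector feeding a frozen, reused teacher classifier.
    Dimensions (512-d student features, 2048-d teacher features, 1000 classes)
    are placeholders."""
    def __init__(self, student_backbone, teacher_classifier,
                 student_dim=512, teacher_dim=2048):
        super().__init__()
        self.backbone = student_backbone
        self.projector = nn.Linear(student_dim, teacher_dim)
        self.classifier = teacher_classifier
        for p in self.classifier.parameters():   # reuse the teacher head as-is
            p.requires_grad = False

    def forward(self, x):
        feat = self.projector(self.backbone(x))  # align to the teacher's feature space
        return self.classifier(feat)

# Toy usage with stand-in modules.
student_backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))
teacher_classifier = nn.Linear(2048, 1000)
model = StudentWithReusedClassifier(student_backbone, teacher_classifier)
logits = model(torch.randn(4, 3, 32, 32))
```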

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-power computer …, 2022 - taylorfrancis.com
This chapter surveys approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages and disadvantages of current methods …
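As a concrete reference point for the kind of methods such a survey covers, here is a minimal uniform affine quantization round-trip (per-tensor scale and zero-point derived from the min/max); this is the textbook scheme, not any particular method from the chapter:

```python
import torch

def quantize_uniform_affine(x, num_bits=8):
    """Map a float tensor to unsigned integer codes with a per-tensor scale
    and zero-point, and return both the codes and the dequantized values."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale).clamp(qmin, qmax)
    q = torch.round(x / scale + zero_point).clamp(qmin, qmax)  # integer codes
    x_hat = (q - zero_point) * scale                           # dequantized values
    return q, x_hat

w = torch.randn(256, 256)
q, w_hat = quantize_uniform_affine(w)
print((w - w_hat).abs().max())  # error is bounded by roughly scale / 2
```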

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Ensemble distillation for robust model fusion in federated learning

T Lin, L Kong, SU Stich, M Jaggi - Advances in neural …, 2020 - proceedings.neurips.cc
Federated Learning (FL) is a machine learning setting where many devices collaboratively
train a machine learning model while keeping the training data decentralized. In most of the …
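A minimal sketch of what I take to be the general idea of ensemble distillation for model fusion: average the client models' softened predictions on some unlabeled data and train the fused server model to match them; the function name, temperature, and toy models below are assumptions for illustration, not the paper's algorithm:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ensemble_distill_step(server_model, client_models, x_unlabeled, optimizer, T=1.0):
    """One fusion step: distill the averaged client ensemble into the server model."""
    with torch.no_grad():
        # The "teacher" is the average of the clients' softened predictions.
        p_ensemble = torch.stack([F.softmax(m(x_unlabeled) / T, dim=-1)
                                  for m in client_models]).mean(dim=0)
    log_p_server = F.log_softmax(server_model(x_unlabeled) / T, dim=-1)
    loss = F.kl_div(log_p_server, p_ensemble, reduction="batchmean") * (T ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with linear models standing in for client and server networks.
clients = [nn.Linear(10, 5) for _ in range(3)]
server = nn.Linear(10, 5)
opt = torch.optim.SGD(server.parameters(), lr=0.1)
ensemble_distill_step(server, clients, torch.randn(16, 10), opt)
```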