Knowledge distillation: A survey
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …
Enabling all in-edge deep learning: A literature review
In recent years, deep learning (DL) models have demonstrated remarkable achievements
on non-trivial tasks such as speech recognition, image processing, and natural language …
A survey of model compression strategies for object detection
Z Lyu, T Yu, F Pan, Y Zhang, J Luo, D Zhang… - Multimedia tools and …, 2024 - Springer
Deep neural networks (DNNs) have achieved great success in many object detection tasks.
However, such DNN-based large object detection models are generally computationally …
Distilling global and local logits with densely connected relations
In prevalent knowledge distillation, logits in most image recognition models are computed by
global average pooling and then used to learn to encode the high-level and task-relevant …
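As a rough illustration of the setup this abstract describes (not the paper's actual method), the sketch below shows class logits obtained by global average pooling a CNN feature map, distilled against a teacher with a temperature-softened KL term; the function names are my own.

```python
# Minimal sketch: GAP-based logits plus logit-level distillation (illustrative only).
import torch
import torch.nn.functional as F

def gap_logits(feature_map: torch.Tensor, classifier: torch.nn.Linear) -> torch.Tensor:
    # feature_map: (N, C, H, W) -> global average pooling -> (N, C) -> class logits
    pooled = feature_map.mean(dim=(2, 3))
    return classifier(pooled)

def logit_kd_loss(student_logits, teacher_logits, T: float = 4.0):
    # Temperature-softened KL divergence between teacher and student distributions.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```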
A Survey on Knowledge Distillation: Recent Advancements
A Moslemi, A Briskina, Z Dang, J Li - Machine Learning with Applications, 2024 - Elsevier
Deep learning has achieved notable success across academia, medicine, and industry. Its
ability to identify complex patterns in large-scale data and to manage millions of parameters …
Collaborative multi-teacher knowledge distillation for learning low bit-width deep neural networks
Knowledge distillation, which learns a lightweight student model by distilling
knowledge from a cumbersome teacher model, is an attractive approach for learning …
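For context, a minimal sketch of the teacher-student objective this entry refers to, extended to several frozen teachers whose softened predictions are averaged; the weighting and temperature are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of multi-teacher knowledge distillation (illustrative only).
import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          T: float = 4.0, alpha: float = 0.7):
    # Average the teachers' temperature-softened distributions as the soft target.
    soft_targets = torch.stack(
        [F.softmax(t / T, dim=1) for t in teacher_logits_list]
    ).mean(dim=0)
    distill = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                       soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce
```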
Conditional pseudo-supervised contrast for data-Free knowledge distillation
Data-free knowledge distillation (DFKD) is an effective way to address model compression
and transmission restrictions while preserving privacy, and it has attracted …
Quantized feature distillation for network quantization
Neural network quantization aims to accelerate and trim full-precision neural network
models by using low-bit approximations. Methods adopting quantization-aware training …
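To make the low-bit approximation concrete, here is a minimal fake-quantization sketch of the kind used in quantization-aware training: a uniform symmetric quantizer with a straight-through estimator. The bit-width and clipping scheme are assumptions for illustration, not the cited method's design.

```python
# Uniform symmetric fake quantization with a straight-through estimator (sketch).
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    # Forward pass uses the quantized value; backward passes gradients
    # through as if quantization were the identity.
    return x + (q * scale - x).detach()
```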
Fbi-llm: Scaling up fully binarized llms from scratch via autoregressive distillation
This work presents a Fully BInarized Large Language Model (FBI-LLM), demonstrating for
the first time how to train a large-scale binary language model from scratch (not the partial …
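The two ingredients named in this abstract can be sketched as follows, under my own assumptions: a linear layer whose weights are binarized to {-1, +1} with a per-layer scale and a straight-through estimator, and an autoregressive distillation loss matching the student's next-token distribution to a full-precision teacher's. This is illustrative, not the FBI-LLM recipe.

```python
# Illustrative binarized layer and token-level distillation loss (not FBI-LLM's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x):
        w = self.weight
        scale = w.abs().mean()                    # per-layer scaling factor
        w_bin = torch.sign(w) * scale             # weights in {-scale, 0, +scale}
        w_ste = w + (w_bin - w).detach()          # straight-through estimator
        return F.linear(x, w_ste)

def autoregressive_distill_loss(student_logits, teacher_logits, T: float = 1.0):
    # Both logits: (batch, seq_len, vocab); match per-token next-token distributions.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```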
Self-Supervised Quantization-Aware Knowledge Distillation
Quantization-aware training (QAT) and Knowledge Distillation (KD) are combined to achieve
competitive performance in creating low-bit deep learning models. However, existing works …
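A compact sketch of the combination this abstract mentions, assuming the student already contains fake-quantized (QAT) layers and is trained against a full-precision teacher's soft targets plus hard labels; model and optimizer names are placeholders, and the self-supervised component of the cited method is not reproduced here.

```python
# One training step combining QAT and KD (illustrative sketch).
import torch
import torch.nn.functional as F

def qat_kd_step(student, teacher, x, labels, optimizer, T: float = 4.0, alpha: float = 0.5):
    with torch.no_grad():
        teacher_logits = teacher(x)               # frozen full-precision teacher
    student_logits = student(x)                   # student with fake-quantized layers
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    loss = alpha * kd + (1.0 - alpha) * ce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```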