Dataset distillation: A comprehensive review
Recent success of deep learning is largely attributed to the sheer amount of data used for
training deep neural networks. Despite the unprecedented success, the massive data …
Distilling knowledge via knowledge review
Knowledge distillation transfers knowledge from the teacher network to the student
one, with the goal of greatly improving the performance of the student network. Previous …
Camel: Communicative agents for "mind" exploration of large language model society
The rapid advancement of chat-based language models has led to remarkable progress in
complex task-solving. However, their success heavily relies on human input to guide the …
A survey on model compression for large language models
Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …
Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting
old ones. Traditional CIL models are trained from scratch to continually acquire knowledge …
Metamath: Bootstrap your own mathematical questions for large language models
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …
Decoupled knowledge distillation
State-of-the-art distillation methods are mainly based on distilling deep features from
intermediate layers, while the significance of logit distillation is greatly overlooked. To …
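For orientation, below is a minimal sketch of plain logit distillation (Hinton-style knowledge distillation with temperature-softened softmax), the baseline that the decoupled formulation above revisits. It is not the paper's decoupled loss; the temperature, loss weighting, and tensor shapes are illustrative assumptions.

    # Minimal logit-distillation loss: KL divergence between temperature-softened
    # teacher and student class distributions, combined with cross-entropy on labels.
    import torch
    import torch.nn.functional as F

    def logit_distillation_loss(student_logits, teacher_logits, labels,
                                temperature=4.0, alpha=0.5):
        # Soften both distributions with the same temperature.
        log_p_student = F.log_softmax(student_logits / temperature, dim=1)
        p_teacher = F.softmax(teacher_logits / temperature, dim=1)
        # Scale the KL term by T^2 so its gradient magnitude matches the CE term.
        kd = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kd + (1.0 - alpha) * ce

    # Toy usage with random logits for a 10-class problem.
    student_logits = torch.randn(8, 10, requires_grad=True)
    teacher_logits = torch.randn(8, 10)
    labels = torch.randint(0, 10, (8,))
    loss = logit_distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()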
Point-to-voxel knowledge distillation for lidar semantic segmentation
This article addresses the problem of distilling knowledge from a large teacher model to a
slim student network for LiDAR semantic segmentation. Directly employing previous …
Knowledge distillation from a stronger teacher
Unlike existing knowledge distillation methods, which focus on baseline settings where the
teacher models and training strategies are not as strong and competitive as state-of-the-art …
A survey of quantization methods for efficient neural network inference
This chapter surveys approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages and disadvantages of current methods …
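As an illustration of the simplest scheme covered by such surveys, here is a minimal sketch of uniform affine (asymmetric) quantization of a float tensor to 8-bit integers with a scale and zero point. The bit width and random weights are illustrative assumptions; practical frameworks add per-channel scales, calibration, and integer kernels.

    # Uniform affine quantization: q = clip(round(x / scale) + zero_point, 0, 2^b - 1).
    import numpy as np

    def quantize_affine(x, num_bits=8):
        qmin, qmax = 0, 2 ** num_bits - 1
        x_min, x_max = float(x.min()), float(x.max())
        scale = (x_max - x_min) / (qmax - qmin) if x_max > x_min else 1.0
        zero_point = int(round(qmin - x_min / scale))
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
        return q, scale, zero_point

    def dequantize_affine(q, scale, zero_point):
        # Recover an approximation of the original floats.
        return (q.astype(np.float32) - zero_point) * scale

    weights = np.random.randn(4, 4).astype(np.float32)
    q, scale, zp = quantize_affine(weights)
    error = np.abs(weights - dequantize_affine(q, scale, zp)).max()
    print(f"max round-trip error: {error:.4f}")  # bounded by roughly scale / 2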