DisWOT: Student architecture search for distillation without training

P Dong, L Li, Z Wei - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Knowledge distillation (KD) is an effective training strategy for improving lightweight
student models under the guidance of cumbersome teachers. However, the large …
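
For readers new to the area, the sketch below shows the classic response-based distillation objective that this line of work builds on (temperature-scaled KL divergence plus cross-entropy, in the style of Hinton et al.). The function name, temperature, and weighting are illustrative assumptions, not DisWOT's search procedure:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: temperature-scaled KL between teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

The student is trained on this combined loss while the teacher stays frozen; DisWOT's contribution is choosing the student architecture for this setup without any training.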

Automated knowledge distillation via Monte Carlo tree search

L Li, P Dong, Z Wei, Y Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In this paper, we present Auto-KD, the first automated search framework for optimal
knowledge distillation design. Traditional distillation techniques typically require handcrafted …
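
As background for the tree-search component named in the title, the generic UCT selection rule commonly used in Monte Carlo tree search is shown below; this is the textbook criterion, not necessarily the exact variant used in Auto-KD:

$$\mathrm{UCT}(s,a) \;=\; \frac{Q(s,a)}{N(s,a)} \;+\; c\,\sqrt{\frac{\ln N(s)}{N(s,a)}}$$

where $Q(s,a)$ is the total reward accumulated for action $a$ at state $s$, $N(\cdot)$ counts visits, and $c$ trades off exploration against exploitation.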

C2KD: Bridging the modality gap for cross-modal knowledge distillation

F Huo, W Xu, J Guo, H Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Existing Knowledge Distillation (KD) methods typically focus on transferring
knowledge from a large-capacity teacher to a low-capacity student model, achieving …

KD-Zero: Evolving knowledge distiller for any teacher-student pairs

L Li, P Dong, A Li, Z Wei… - Advances in Neural …, 2023 - proceedings.neurips.cc
Knowledge distillation (KD) has emerged as an effective model-compression technique
that can enhance lightweight models. Conventional KD methods propose various …

SasWOT: Real-time semantic segmentation architecture search without training

C Zhu, L Li, Y Wu, Z Sun - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
In this paper, we present SasWOT, the first training-free Semantic segmentation Architecture
Search (SAS) framework via an auto-discovery proxy. Semantic segmentation is widely used …
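
To illustrate what a training-free proxy can look like, here is a generic gradient-norm score computed on an untrained network from a single mini-batch, a common zero-cost baseline; this is an assumed stand-in, not the proxy that SasWOT auto-discovers:

```python
import torch
import torch.nn.functional as F

def gradnorm_proxy(model: torch.nn.Module,
                   images: torch.Tensor,
                   labels: torch.Tensor) -> float:
    # Score a randomly initialized network without any training:
    # one forward/backward pass, then sum the parameter-gradient norms.
    model.zero_grad()
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    return sum(p.grad.norm().item()
               for p in model.parameters() if p.grad is not None)
```

In a training-free search, a score like this replaces the expensive train-and-evaluate loop: candidate architectures are ranked by the proxy and only the top-scoring ones are kept.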

Pruner-Zero: Evolving symbolic pruning metric from scratch for large language models

P Dong, L Li, Z Tang, X Liu, X Pan, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite their remarkable capabilities, Large Language Models (LLMs) face deployment
challenges due to their extensive size. Pruning methods drop a subset of weights to …
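
As a concrete reference point for what a pruning metric does, below is a plain layer-wise magnitude criterion (drop the smallest-|w| entries); Pruner-Zero instead evolves such a metric symbolically, so this sketch is only an assumed baseline:

```python
import torch

def prune_by_magnitude(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    # Zero out the `sparsity` fraction of entries with the smallest |w|
    # and return the masked weight matrix.
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask
```

A symbolic search replaces `weight.abs()` with an evolved expression over quantities such as weights and gradients, scored by how well the pruned model retains accuracy.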

Auto-Prox: Training-free vision transformer architecture search via automatic proxy discovery

Z Wei, P Dong, Z Hui, A Li, L Li, M Lu, H Pan… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
The substantial success of Vision Transformer (ViT) in computer vision tasks is largely
attributed to the architecture design. This underscores the necessity of efficient architecture …

Applications of knowledge distillation in remote sensing: A survey

Y Himeur, N Aburaed, O Elharrouss, I Varlamis… - Information …, 2024 - Elsevier
With the ever-growing complexity of models in the field of remote sensing (RS), there is an
increasing demand for solutions that balance model accuracy with computational efficiency …

DetKDS: Knowledge distillation search for object detectors

L Li, Y Bao, P Dong, C Yang, A Li, W Luo… - … on Machine Learning, 2024 - openreview.net
In this paper, we present DetKDS, the first framework that searches for optimal detection
distillation policies. Manual design of detection distillers becomes challenging and time …

Parameter-efficient and student-friendly knowledge distillation

J Rao, X Meng, L Ding, S Qi, X Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained models are frequently employed in multimodal learning. However, these models
have too many parameters and require substantial effort to fine-tune on downstream tasks …