Categories of response-based, feature-based, and relation-based knowledge distillation
Deep neural networks have achieved remarkable performance for artificial intelligence
tasks. The success behind intelligent systems often relies on large-scale models with high …
Simple Unsupervised Knowledge Distillation With Space Similarity
As per recent studies, self-supervised learning (SSL) does not readily extend to smaller
architectures. One direction to mitigate this shortcoming while simultaneously training a …
Knowledge Distillation in RNN-Attention Models for Early Prediction of Student Performance
Educational data mining (EDM) is a part of applied computing that focuses on automatically
analyzing data from learning contexts. Early prediction for identifying at-risk students is a …
KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer
K Zhao, N Ukita - arXiv preprint arXiv:2302.11208, 2023 - arxiv.org
Scaled dot-product attention applies a softmax function on the scaled dot-product of queries
and keys to calculate weights and then multiplies the weights and values. In this work, we …
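For reference, the scaled dot-product attention described in this snippet can be sketched as below. This is a minimal NumPy illustration of the standard mechanism, not code from the KS-DETR paper; the function name, array shapes, and variable names are assumptions for the example.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q: (num_queries, d_k), K: (num_keys, d_k), V: (num_keys, d_v)
        d_k = Q.shape[-1]
        # scaled dot product of queries and keys
        scores = Q @ K.T / np.sqrt(d_k)
        # softmax over the key dimension gives the attention weights
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # multiply the weights and values
        return weights @ V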
Exemplar-Free Continual Learning in Vision Transformers via Feature Attention Distillation
X Dai, J Cheng, Z Wei, B Du - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
In this paper, we propose a new approach for continual learning based on Vision
Transformers (ViTs). The purpose of continual learning is to address the catastrophic …