RIFormer: Keep your vision backbone effective but removing token mixer
This paper studies how to keep a vision backbone effective while removing token mixers in
its basic building blocks. Token mixers, as self-attention for vision transformers (ViTs), are …
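As a rough illustration of the idea in this snippet (a minimal sketch assuming a standard MetaFormer-style block, not the authors' code), removing the token mixer amounts to replacing the self-attention sub-layer with an identity mapping:

```python
import torch
import torch.nn as nn

class TokenMixerFreeBlock(nn.Module):
    """Minimal sketch: a MetaFormer-style block whose token mixer is
    removed, i.e. replaced by nn.Identity(). Illustrative only, not
    the RIFormer reference implementation."""
    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mixer = nn.Identity()  # self-attention would normally sit here
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); the mixer branch collapses to a residual no-op
        x = x + self.token_mixer(self.norm1(x))
        x = x + self.mlp(self.norm2(x))
        return x
```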
Masked autoencoders are stronger knowledge distillers
Knowledge distillation (KD) has shown great success in improving the student's performance by mimicking the intermediate output of the high-capacity teacher in fine …
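The snippet above describes feature-based distillation: the student is trained to regress the teacher's intermediate activations. A minimal sketch of that objective follows; the projection layer and the Smooth-L1 choice are illustrative assumptions, not taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def feature_mimic_loss(student_feat: torch.Tensor,
                       teacher_feat: torch.Tensor,
                       proj: nn.Module) -> torch.Tensor:
    """Minimal sketch of feature-mimicking KD. `proj` is a hypothetical
    layer (e.g. nn.Linear) aligning student width to teacher width."""
    with torch.no_grad():
        target = teacher_feat.detach()  # frozen teacher provides the target
    return F.smooth_l1_loss(proj(student_feat), target)

# Usage sketch: 768-d student features mimic 1024-d teacher features.
proj = nn.Linear(768, 1024)
loss = feature_mimic_loss(torch.randn(8, 196, 768),
                          torch.randn(8, 196, 1024), proj)
```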
Lightweight and optimization acceleration methods for vision transformer: A review
M Chen, J Gao, W Yu - 2022 IEEE 25th International …, 2022 - ieeexplore.ieee.org
With the rapid development of technologies such as smart home, smart medical and
autonomous driving, lightweight networks play an important role in promoting the application …
UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors
Knowledge distillation (KD) has become a standard method to boost the performance of lightweight object detectors. Most previous works are feature-based, where …
Optimizing Vision Transformers with Data-Free Knowledge Transfer
The groundbreaking performance of transformers in Natural Language Processing (NLP)
tasks has led to their replacement of traditional Convolutional Neural Networks (CNNs) …
An Attention-based Representation Distillation Baseline for Multi-Label Continual Learning
The field of Continual Learning (CL) has inspired numerous researchers over the years,
leading to increasingly advanced countermeasures to the issue of catastrophic forgetting …
DQFormer: Dynamic query transformer for lane detection
H Yang, S Lin, R Jiang, Y Lu… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Lane detection is one of the most important tasks in self-driving. The critical purpose of lane
detection is the prediction of lane shapes. Meanwhile, it is challenging and difficult to …