SPViT: Enabling faster vision transformers via latency-aware soft token pruning
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …
Evaluating the robustness of neural networks: An extreme value theory approach
The robustness of neural networks to adversarial examples has received great attention due
to security implications. Despite various attack approaches to crafting visually imperceptible …
PatDNN: Achieving real-time DNN execution on mobile devices with pattern-based weight pruning
With the emergence of a spectrum of high-end mobile devices, many applications that
formerly required desktop-level computation capability are being transferred to these …
CHEX: Channel exploration for CNN model compression
Channel pruning has been broadly recognized as an effective technique to reduce the
computation and memory cost of deep convolutional neural networks. However …
PCONV: The missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices
Model compression techniques on Deep Neural Network (DNN) have been widely
acknowledged as an effective way to achieve acceleration on a variety of platforms, and …
Advancing model pruning via bi-level optimization
The deployment constraints in practical applications necessitate the pruning of large-scale
deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket …
YOLObile: Real-time object detection on mobile devices via compression-compilation co-design
The rapid development and wide utilization of object detection techniques have aroused
attention on both accuracy and speed of object detectors. However, the current state-of-the …
Mix and match: A novel FPGA-centric deep neural network quantization framework
Deep Neural Networks (DNNs) have achieved extraordinary performance in various
application domains. To support diverse DNN models, efficient implementations of DNN …