SPViT: Enabling faster vision transformers via latency-aware soft token pruning

Z Kong, P Dong, X Ma, X Meng, W Niu, M Sun… - European conference on …, 2022 - Springer
Recently, Vision Transformer (ViT) has continuously established new milestones in
the computer vision field, while the high computation and memory cost makes its …

Evaluating the robustness of neural networks: An extreme value theory approach

TW Weng, H Zhang, PY Chen, J Yi, D Su, Y Gao… - arXiv preprint arXiv …, 2018 - arxiv.org
The robustness of neural networks to adversarial examples has received great attention due
to security implications. Despite various attack approaches to crafting visually imperceptible …

PatDNN: Achieving real-time DNN execution on mobile devices with pattern-based weight pruning

W Niu, X Ma, S Lin, S Wang, X Qian, X Lin… - Proceedings of the …, 2020 - dl.acm.org
With the emergence of a spectrum of high-end mobile devices, many applications that
formerly required desktop-level computation capability are being transferred to these …

CHEX: Channel exploration for CNN model compression

Z Hou, M Qin, F Sun, X Ma, K Yuan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Channel pruning has been broadly recognized as an effective technique to reduce the
computation and memory cost of deep convolutional neural networks. However …

PCONV: The missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices

X Ma, FM Guo, W Niu, X Lin, J Tang, K Ma… - Proceedings of the …, 2020 - ojs.aaai.org
Model compression techniques on Deep Neural Network (DNN) have been widely
acknowledged as an effective way to achieve acceleration on a variety of platforms, and …

Advancing model pruning via bi-level optimization

Y Zhang, Y Yao, P Ram, P Zhao… - Advances in …, 2022 - proceedings.neurips.cc
The deployment constraints in practical applications necessitate the pruning of large-scale
deep learning models, i.e., promoting their weight sparsity. As illustrated by the Lottery Ticket …

YOLObile: Real-time object detection on mobile devices via compression-compilation co-design

Y Cai, H Li, G Yuan, W Niu, Y Li, X Tang… - Proceedings of the …, 2021 - ojs.aaai.org
The rapid development and wide utilization of object detection techniques have aroused
attention on both accuracy and speed of object detectors. However, the current state-of-the …

Mix and match: A novel FPGA-centric deep neural network quantization framework

SE Chang, Y Li, M Sun, R Shi, HKH So… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have achieved extraordinary performance in various
application domains. To support diverse DNN models, efficient implementations of DNN …