A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning

IAM Huijben, W Kool, MB Paulus… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …

Bringing AI to edge: From deep learning's perspective

D Liu, H Kong, X Luo, W Liu, R Subramaniam - Neurocomputing, 2022 - Elsevier
Edge computing and artificial intelligence (AI), especially deep learning algorithms, are
gradually intersecting to build the novel system, namely edge intelligence. However, the …

A-ViT: Adaptive tokens for efficient vision transformer

H Yin, A Vahdat, JM Alvarez, A Mallya… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce A-ViT, a method that adaptively adjusts the inference cost of vision transformer
(ViT) for images of different complexity. A-ViT achieves this by automatically reducing the …

Dynamic neural networks: A survey

Y Han, G Huang, S Song, L Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Dynamic neural network is an emerging research topic in deep learning. Compared to static
models which have fixed computational graphs and parameters at the inference stage …

Adaptive rotated convolution for rotated object detection

Y Pu, Y Wang, Z Xia, Y Han, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Rotated object detection aims to identify and locate objects in images with arbitrary
orientation. In this scenario, the oriented directions of objects vary considerably across …

AdaViT: Adaptive vision transformers for efficient image recognition

L Meng, H Li, BC Chen, S Lan, Z Wu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Built on top of self-attention mechanisms, vision transformers have demonstrated
remarkable performance on a variety of vision tasks recently. While achieving excellent …

A dynamic multi-scale voxel flow network for video prediction

X Hu, Z Huang, A Huang, J Xu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The performance of video prediction has been greatly boosted by advanced deep neural
networks. However, most of the current methods suffer from large model sizes and require …

Be your own teacher: Improve the performance of convolutional neural networks via self distillation

L Zhang, J Song, A Gao, J Chen… - Proceedings of the …, 2019 - openaccess.thecvf.com
Convolutional neural networks have been widely deployed in various application scenarios.
In order to extend the applications' boundaries to some accuracy-crucial domains …

Salient object detection in the deep learning era: An in-depth survey

W Wang, Q Lai, H Fu, J Shen, H Ling… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
As an essential problem in computer vision, salient object detection (SOD) has attracted an
increasing amount of research attention over the years. Recent advances in SOD are …

Not all images are worth 16x16 words: Dynamic transformers for efficient image recognition

Y Wang, R Huang, S Song… - Advances in neural …, 2021 - proceedings.neurips.cc
Vision Transformers (ViT) have achieved remarkable success in large-scale image
recognition. They split every 2D image into a fixed number of patches, each of which is …