Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …
applications. Accelerating their training is a major challenge and techniques range from …
A survey of techniques for optimizing deep learning on GPUs
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to
its unique features, the GPU continues to remain the most widely used accelerator for DL …
its unique features, the GPU continues to remain the most widely used accelerator for DL …
Hardnet: A low memory traffic network
P Chao, CY Kao, YS Ruan… - Proceedings of the …, 2019 - openaccess.thecvf.com
State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet
have achieved outstanding accuracy over low MACs and small model size counterparts …
have achieved outstanding accuracy over low MACs and small model size counterparts …
fpgaConvNet: Map** regular and irregular convolutional neural networks on FPGAs
Since neural networks renaissance, convolutional neural networks (ConvNets) have
demonstrated a state-of-the-art performance in several emerging artificial intelligence tasks …
demonstrated a state-of-the-art performance in several emerging artificial intelligence tasks …
Transfer learning for sEMG hand gestures recognition using convolutional neural networks
In the realm of surface electromyography (sEMG) gesture recognition, deep learning
algorithms are seldom employed. This is due in part to the large quantity of data required for …
algorithms are seldom employed. This is due in part to the large quantity of data required for …
{PET}: Optimizing tensor programs with partially equivalent transformations and automated corrections
High-performance tensor programs are critical for efficiently deploying deep neural network
(DNN) models in real-world tasks. Existing frameworks optimize tensor programs by …
(DNN) models in real-world tasks. Existing frameworks optimize tensor programs by …