Hardware and software optimizations for accelerating deep neural networks: Survey of current trends, challenges, and the road ahead

M Capra, B Bussolino, A Marchisio, G Masera… - IEEE …, 2020 - ieeexplore.ieee.org
Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning
(DL) is already present in many applications ranging from computer vision for medicine to …

ConfuciuX: Autonomous hardware resource assignment for DNN accelerators using reinforcement learning

SC Kao, G Jeong, T Krishna - 2020 53rd Annual IEEE/ACM …, 2020 - ieeexplore.ieee.org
DNN accelerators provide efficiency by leveraging reuse of activations/weights/outputs
during the DNN computations to reduce data movement from DRAM to the chip. The reuse is …
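
To make the reuse concrete, here is a minimal back-of-the-envelope sketch (mine, not the paper's model): counting DRAM weight fetches for a small matrix multiply with and without a weight-stationary on-chip buffer. All sizes and names are illustrative.

    import numpy as np

    M, K, N = 8, 8, 8          # output rows, inner dimension, output cols
    A = np.random.rand(M, K)   # activations
    W = np.random.rand(K, N)   # weights
    C = A @ W                  # the computation being accelerated

    # No reuse: every one of the M*K*N MACs refetches its weight from DRAM.
    naive_fetches = M * K * N

    # Weight-stationary reuse: each weight is loaded on chip once and then
    # reused across all M rows of activations.
    stationary_fetches = K * N

    print(naive_fetches, stationary_fetches)   # 512 vs. 64: an 8x reduction

Per its title, the paper's subject is assigning such hardware resources automatically with reinforcement learning; the sketch only shows why reuse pays off.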

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …
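
As a concrete instance of the sparse-tensor processing such surveys cover (sketch mine, not the survey's), a compressed sparse row (CSR) encoding stores only nonzero weights, letting a matrix-vector product skip zeros entirely:

    import numpy as np

    W = np.array([[0., 2., 0.],
                  [1., 0., 0.],
                  [0., 0., 3.]])

    # CSR: nonzero values, their column indices, and per-row start offsets.
    values, col_idx, row_ptr = [], [], [0]
    for row in W:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))

    x = np.array([1., 2., 3.])
    y = np.zeros(W.shape[0])
    for i in range(W.shape[0]):
        for k in range(row_ptr[i], row_ptr[i + 1]):   # touch nonzeros only
            y[i] += values[k] * x[col_idx[k]]

    assert np.allclose(y, W @ x)                      # [4. 1. 9.]

The irregularity named in the title is visible even here: the per-row trip count varies with the data, which is what makes such computations hard to map onto regular hardware.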

A multi-neural network acceleration architecture

E Baek, D Kwon, J Kim - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org
Cost-effective multi-tenant neural network execution is becoming one of the most important
design goals for modern neural network accelerators. For example, as emerging AI services …

Procrustes: A dataflow and accelerator for sparse deep neural network training

D Yang, A Ghasemazar, X Ren, M Golub… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
The success of DNN pruning has led to the development of energy-efficient inference
accelerators that support pruned models with sparse weight and activation tensors. Because …
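
For context on where those sparse tensors come from, a minimal magnitude-pruning sketch (illustrative only; Procrustes itself concerns the hardware dataflow for training such models, not this software step):

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 4))

    sparsity = 0.75                                # fraction of weights to zero
    threshold = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= threshold                  # keep only the largest weights
    W_pruned = W * mask

    print(int(mask.sum()), "of", W.size, "weights kept")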

Laconic deep learning inference acceleration

S Sharify, AD Lascorz, M Mahmoud, M Nikolic… - Proceedings of the 46th …, 2019 - dl.acm.org
We present a method for transparently identifying ineffectual computations during inference
with Deep Learning models. Specifically, by decomposing multiplications down to the bit …
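
A hedged sketch of the general principle (not Laconic's exact scheme): a multiplication decomposes into one shift-and-add term per set bit of the multiplier, so zero bits contribute nothing, and hardware that skips them avoids ineffectual work.

    def bit_serial_multiply(a: int, b: int):
        """Return a*b plus the number of effectual (nonzero-bit) terms."""
        product, effectual = 0, 0
        for pos in range(b.bit_length()):
            if (b >> pos) & 1:           # a zero bit would add nothing
                product += a << pos      # one shift-and-add per set bit
                effectual += 1
        return product, effectual

    prod, terms = bit_serial_multiply(25, 18)   # 18 = 0b10010: two set bits
    assert prod == 25 * 18
    print(prod, terms)                          # 450, 2 terms for 5 bit positions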

Gemmini: An agile systolic array generator enabling systematic evaluations of deep-learning architectures

H Genc, A Haj-Ali, V Iyer, A Amid, H Mao… - arXiv preprint arXiv …, 2019 - alonamid.github.io
Advances in deep learning and neural networks have resulted in rapid development of
hardware accelerators that support them. A large majority of ASIC accelerators, however …

FlexCNN: An end-to-end framework for composing CNN accelerators on FPGA

S Basalama, A Sohrabizadeh, J Wang, L Guo… - ACM Transactions on …, 2023 - dl.acm.org
With reduced data reuse and parallelism, recent convolutional neural networks (CNNs)
create new challenges for FPGA acceleration. Systolic arrays (SAs) are efficient, scalable …
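
For readers unfamiliar with the SA dataflow, here is a minimal cycle-by-cycle model of an output-stationary systolic array (an illustrative simulation, not FlexCNN's generated hardware): inputs are skewed so the k-th operand pair for output (i, j) reaches its processing element at cycle i + j + k.

    import numpy as np

    def systolic_matmul(A, B):
        """Simulate an output-stationary systolic array computing A @ B."""
        M, K = A.shape
        K2, N = B.shape
        assert K == K2
        C = np.zeros((M, N))                 # one accumulator per PE
        for t in range(M + N + K - 2):       # total pipeline cycles
            for i in range(M):
                for j in range(N):
                    k = t - i - j            # skewed arrival schedule
                    if 0 <= k < K:
                        C[i, j] += A[i, k] * B[k, j]
        return C

    A = np.arange(6.0).reshape(2, 3)
    B = np.arange(12.0).reshape(3, 4)
    assert np.allclose(systolic_matmul(A, B), A @ B)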

Review and benchmarking of precision-scalable multiply-accumulate unit architectures for embedded neural-network processing

V Camus, L Mei, C Enz… - IEEE Journal on Emerging …, 2019 - ieeexplore.ieee.org
The current trend in deep learning comes with an enormous computational need:
billions of Multiply-Accumulate (MAC) operations per inference. Fortunately, reduced …
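
To illustrate the principle behind precision scalability (my sketch, not one of the benchmarked architectures): one 8-bit multiply can be assembled from four 4-bit sub-products, so the same datapath can alternatively serve several independent low-precision MACs when a model tolerates reduced precision.

    def mul8_from_4bit(a: int, b: int) -> int:
        """Build an 8-bit x 8-bit product from four 4-bit x 4-bit multiplies."""
        a_hi, a_lo = a >> 4, a & 0xF
        b_hi, b_lo = b >> 4, b & 0xF
        # Four partial products, shifted to their binary weight and summed.
        return ((a_hi * b_hi) << 8) + ((a_hi * b_lo) << 4) \
             + ((a_lo * b_hi) << 4) + (a_lo * b_lo)

    for a, b in [(200, 123), (15, 255), (0, 77)]:
        assert mul8_from_4bit(a, b) == a * b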

dMazeRunner: Executing perfectly nested loops on dataflow accelerators

S Dave, Y Kim, S Avancha, K Lee… - ACM Transactions on …, 2019 - dl.acm.org
Dataflow accelerators feature simplicity, programmability, and energy-efficiency and are
envisioned as a promising architecture for accelerating perfectly nested loops that dominate …
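
As an example of the target workload (sketch and sizes mine): direct 2-D convolution is exactly such a perfectly nested loop, seven loops around a single multiply-accumulate statement.

    import numpy as np

    N, C, H, W = 1, 2, 5, 5        # batch, input channels, height, width
    M, R, S = 3, 3, 3              # output channels, kernel height, kernel width
    I = np.random.rand(N, C, H, W)
    F = np.random.rand(M, C, R, S)
    O = np.zeros((N, M, H - R + 1, W - S + 1))

    for n in range(N):
        for m in range(M):
            for y in range(H - R + 1):
                for x in range(W - S + 1):
                    for c in range(C):
                        for r in range(R):
                            for s in range(S):
                                O[n, m, y, x] += I[n, c, y + r, x + s] * F[m, c, r, s]

How such a nest is tiled, reordered, and spatially unrolled across processing elements is the mapping space that dataflow-accelerator tools explore.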