FPGA-based accelerators of deep learning networks for learning and classification: A review

A Shawahna, SM Sait, A El-Maleh - IEEE Access, 2018 - ieeexplore.ieee.org
Due to recent advances in digital technologies, and availability of credible data, an area of
artificial intelligence, deep learning, has emerged and has demonstrated its ability and …

A survey of FPGA-based accelerators for convolutional neural networks

S Mittal - Neural computing and applications, 2020 - Springer
Deep convolutional neural networks (CNNs) have recently shown very high accuracy in a
wide range of cognitive tasks, and due to this, they have received significant interest from the …

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

PUMA: A programmable ultra-efficient memristor-based accelerator for machine learning inference

A Ankit, IE Hajj, SR Chalamalasetti, G Ndu… - Proceedings of the …, 2019 - dl.acm.org
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications,
overcoming the fundamental energy efficiency limitations of digital logic. They have been …

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …

A high-throughput and power-efficient FPGA implementation of YOLO CNN for object detection

DT Nguyen, TN Nguyen, H Kim… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) require numerous computations and external
memory accesses. Frequent accesses to off-chip memory cause slow processing and large …

MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects

H Kwon, A Samajdar, T Krishna - ACM SIGPLAN Notices, 2018 - dl.acm.org
Deep neural networks (DNN) have demonstrated highly promising results across computer
vision and speech recognition, and are becoming foundational for ubiquitous AI. The …

A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives

B Peccerillo, M Mannino, A Mondelli… - Journal of Systems …, 2022 - Elsevier
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon”
issue and diminishing returns of an ever-increasing core count. Hardware manufacturers …