Accelerating CNN inference on ASICs: A survey
Convolutional neural networks (CNNs) have proven to be a disruptive technology in most
vision, speech and image processing tasks. Given their ubiquitous acceptance, the research …
Cambricon: An instruction set architecture for neural networks
Neural Networks (NN) are a family of models for a broad range of emerging machine
learning and pattern recognition applications. NN techniques are conventionally executed …
Origami: A 803-GOp/s/W convolutional network accelerator
An ever-increasing number of computer vision and image/video processing challenges are
being approached using deep convolutional neural networks, obtaining state-of-the-art …
14.6 A 1.42 TOPS/W deep convolutional neural network recognition processor for intelligent IoE systems
In this paper, we present an energy-efficient CNN processor with 4 key features: (1) a
CNN-optimized neuron processing engine (NPE), (2) a dual-range multiply-accumulate (DRMAC) …
14.1 A 2.9 TOPS/W deep convolutional neural network SoC in FD-SOI 28nm for intelligent embedded systems
A booming number of computer vision, speech recognition, and signal processing
applications are increasingly benefiting from the use of deep convolutional neural networks …
Neurostream: Scalable and energy efficient deep learning with smart memory cubes
High-performance computing systems are moving towards 2.5D and 3D memory
hierarchies, based on High Bandwidth Memory (HBM) and Hybrid Memory Cube (HMC) to …
Data and hardware efficient design for convolutional neural network
YJ Lin, TS Chang - IEEE Transactions on Circuits and Systems I …, 2017 - ieeexplore.ieee.org
Hardware design of deep convolutional neural networks (CNNs) faces challenges of high
computational complexity and data bandwidth as well as huge divergence in different CNN …
Data-optimized neural network traversal
JW Brothers, J Lee - US Patent 10,417,555, 2019 - Google Patents
Executing a neural network includes generating an output tile of a first layer of the neural
network by processing an input tile to the first layer and storing the output tile of the first layer …
HERO: Heterogeneous embedded research platform for exploring RISC-V manycore accelerators on FPGA
Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host
processor with programmable manycore accelerators (PMCAs) to combine general-purpose …
VWA: Hardware efficient vectorwise accelerator for convolutional neural network
KW Chang, TS Chang - … Transactions on Circuits and Systems I …, 2019 - ieeexplore.ieee.org
Hardware accelerators for convolutional neural networks (CNNs) enable real-time
applications of artificial intelligence technology. However, most of the existing designs suffer …