Tetris: Scalable and efficient neural network acceleration with 3D memory

M Gao, J Pu, X Yang, M Horowitz… - Proceedings of the Twenty …, 2017 - dl.acm.org
The high accuracy of deep neural networks (NNs) has led to the development of NN
accelerators that improve performance by two orders of magnitude. However, scaling these …

Accelergy: An architecture-level energy estimation methodology for accelerator designs

YN Wu, JS Emer, V Sze - 2019 IEEE/ACM International …, 2019 - ieeexplore.ieee.org
With Moore's law slowing down and Dennard scaling ended, energy-efficient domain-
specific accelerators, such as deep neural network (DNN) processors for machine learning …

The gem5 simulator

N Binkert, B Beckmann, G Black, SK Reinhardt… - ACM SIGARCH …, 2011 - dl.acm.org
The gem5 simulation infrastructure is the merger of the best aspects of the M5 [4] and GEMS
[9] simulators. M5 provides a highly configurable simulation framework, multiple ISAs, and …

McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures

S Li, JH Ahn, RD Strong, JB Brockman… - Proceedings of the …, 2009 - dl.acm.org
This paper introduces McPAT, an integrated power, area, and timing modeling framework
that supports comprehensive design space exploration for multicore and manycore …

Practical near-data processing for in-memory analytics frameworks

M Gao, G Ayers, C Kozyrakis - 2015 International Conference …, 2015 - ieeexplore.ieee.org
The end of Dennard scaling has made all systems energy-constrained. For data-intensive
applications with limited temporal locality, the major energy bottleneck is data …

DSENT - a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling

C Sun, CHO Chen, G Kurian, L Wei… - 2012 IEEE/ACM …, 2012 - ieeexplore.ieee.org
With the rise of many-core chips that require substantial bandwidth from the network on chip
(NoC), integrated photonic links have been investigated as a promising alternative to …

HRL: Efficient and flexible reconfigurable logic for near-data processing

M Gao, C Kozyrakis - 2016 IEEE International Symposium on …, 2016 - ieeexplore.ieee.org
The energy constraints due to the end of Dennard scaling, the popularity of in-memory
analytics, and the advances in 3D integration technology have led to renewed interest in …

The structural simulation toolkit

AF Rodrigues, KS Hemmert, BW Barrett… - ACM SIGMETRICS …, 2011 - dl.acm.org
As supercomputers grow, understanding their behavior and performance has become
increasingly challenging. New hurdles in scalability, programmability, power consumption …

The McPAT framework for multicore and manycore architectures: Simultaneously modeling power, area, and timing

S Li, JH Ahn, RD Strong, JB Brockman… - ACM Transactions on …, 2013 - dl.acm.org
This article introduces McPAT, an integrated power, area, and timing modeling framework
that supports comprehensive design space exploration for multicore and manycore …

Orion 2.0: A power-area simulator for interconnection networks

AB Kahng, B Li, LS Peh… - IEEE Transactions on Very …, 2011 - ieeexplore.ieee.org
As industry moves towards multicore chips, networks-on-chip (NoCs) are emerging as the
scalable fabric for interconnecting the cores. With power now the first-order design …