CoSA: Scheduling by constrained optimization for spatial accelerators
Recent advances in Deep Neural Networks (DNNs) have led to active development of
specialized DNN accelerators, many of which feature a large number of processing …
Modern development methods and tools for embedded reconfigurable systems: A survey
Heterogeneous reconfigurable systems provide drastically higher performance and lower
power consumption than traditional CPU-centric systems. Moreover, they do it at much lower …
A decade of reconfigurable computing: a visionary retrospective
R Hartenstein - Proceedings design, automation and test in …, 2001 - ieeexplore.ieee.org
The paper surveys a decade of R&D on coarse grain reconfigurable hardware and related
CAD, points out why this emerging discipline is heading toward a dichotomy of computing …
Active pages: A computation model for intelligent memory
Microprocessors and memory systems suffer from a growing gap in performance. We
introduce Active Pages, a computation model which addresses this gap by shifting data …
Using machine learning to focus iterative optimization
Iterative compiler optimization has been shown to outperform static approaches. This,
however, is at the cost of large numbers of evaluations of the program. This paper develops …
Simultaneous multithreading: A platform for next-generation processors
Simultaneous multithreading is a processor design which consumes both thread-level and
instruction-level parallelism. In SMT processors, thread-level parallelism can come from …
Continuous profiling: Where have all the cycles gone?
This article describes the Digital Continuous Profiling Infrastructure, a sampling-based
profiling system designed to run continuously on production systems. The system supports …
[BOOK][B] Modern compiler design
"Modern Compiler Design" makes the topic of compiler design more accessible by focusing
on principles and techniques of wide application. By carefully distinguishing between the …
A single-chip multiprocessor
Presents the case for billion-transistor processor architectures that will consist of chip
multiprocessors (CMPs): multiple (four to 16) simple, fast processors on one chip. In their …