A survey of coarse-grained reconfigurable architecture and design: Taxonomy, challenges, and applications
As general-purpose processors have hit the power wall and chip fabrication cost escalates
alarmingly, coarse-grained reconfigurable architectures (CGRAs) are attracting increasing …
A modern primer on processing in memory
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …
Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology
Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large
bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to …
PipeLayer: A pipelined ReRAM-based accelerator for deep learning
Convolution neural networks (CNNs) are the heart of deep learning applications. Recent
works PRIME [1] and ISAAC [2] demonstrated the promise of using resistive random access …
PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory
Processing-in-memory (PIM) is a promising solution to address the "memory wall"
challenges for future computer systems. Prior proposed PIM architectures put additional …
Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system
Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …
Google workloads for consumer devices: Mitigating data movement bottlenecks
We are experiencing an explosive growth in the number of consumer devices, including
smartphones, tablets, web-based computers such as Chromebooks, and wearable devices …
A scalable processing-in-memory accelerator for parallel graph processing
The explosion of digital data and the ever-growing need for fast data analysis have made in-
memory big-data processing in computer systems increasingly important. In particular, large …
Processing data where it makes sense: Enabling in-memory computation
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …
PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture
Processing-in-memory (PIM) is rapidly rising as a viable solution for the memory wall crisis,
rebounding from its unsuccessful attempts in the 1990s due to practicality concerns, which are …