Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer
The “memory wall” problem or so-called von Neumann bottleneck limits the efficiency of
conventional computer architectures, which move data from memory to CPU for …

Leveraging 3D technology for improved reliability

N Madan, R Balasubramonian - 40th Annual IEEE/ACM …, 2007 - ieeexplore.ieee.org
Aggressive technology scaling over the years has helped improve processor performance
but has caused a reduction in processor reliability. Shrinking transistor sizes and lower …

CMP network-on-chip overlaid with multi-band RF-interconnect

MF Chang, J Cong, A Kaplan, M Naik… - 2008 IEEE 14th …, 2008 - ieeexplore.ieee.org
In this paper, we explore the use of multi-band radio frequency interconnect (or RF-I) with
signal propagation at the speed of light to provide shortcuts in a many core network-on-chip …

Warped-slicer: Efficient intra-SM slicing through dynamic resource partitioning for GPU multiprogramming

Q Xu, H Jeon, K Kim, WW Ro… - ACM SIGARCH Computer …, 2016 - dl.acm.org
As technology scales, GPUs are forecasted to incorporate an ever-increasing amount of
computing resources to support thread-level parallelism. But even with the best effort …

An analysis of on-chip interconnection networks for large-scale chip multiprocessors

D Sanchez, G Michelogiannakis… - ACM Transactions on …, 2010 - dl.acm.org
With the number of cores of chip multiprocessors (CMPs) rapidly growing as technology
scales down, connecting the different components of a CMP in a scalable and efficient way …

In-network cache coherence

N Eisley, LS Peh, L Shang - 2006 39th Annual IEEE/ACM …, 2006 - ieeexplore.ieee.org
With the trend towards increasing number of processor cores in future chip architectures,
scalable directory-based protocols for maintaining cache coherence will be needed …

Swizzle-switch networks for many-core systems

K Sewell, RG Dreslinski, T Manville… - IEEE Journal on …, 2012 - ieeexplore.ieee.org
This work revisits the design of crossbar and high-radix interconnects in light of advances in
circuit and layout techniques that improve crossbar scalability, obviating the need for deep …

[BOOK] Multi-core cache hierarchies

R Balasubramonian, NP Jouppi, N Muralimanohar - 2011 - books.google.com
A key determinant of overall system performance and power dissipation is the cache
hierarchy since access to off-chip memory consumes many more cycles and energy than on …

A heterogeneous multiple network-on-chip design: an application-aware approach

AK Mishra, O Mutlu, CR Das - Proceedings of the 50th annual design …, 2013 - dl.acm.org
Current network-on-chip designs in chip-multiprocessors are agnostic to application
requirements and hence are provisioned for the general case, leading to wasted energy and …

Scalability of broadcast performance in wireless network-on-chip

S Abadal, A Mestres, M Nemirovsky… - … on Parallel and …, 2016 - ieeexplore.ieee.org
Networks-on-Chip (NoCs) are currently the paradigm of choice to interconnect the cores of a
chip multiprocessor. However, conventional NoCs may not suffice to fulfill the on-chip …