System, method, and computer program product for improving memory systems

MS Smith - US Patent 9,432,298, 2016 - Google Patents
H01L25/18—Assemblies consisting of a plurality of individual semiconductor or other solid
state devices; Multistep manufacturing processes thereof the devices being of types …

BlockHammer: Preventing RowHammer at low cost by blacklisting rapidly-accessed DRAM rows

AG Yağlikçi, M Patel, JS Kim, R Azizi… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Aggressive memory density scaling causes modern DRAM devices to suffer from
RowHammer, a phenomenon where rapidly activating (i.e., hammering) a DRAM row can …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

A case for exploiting subarray-level parallelism (SALP) in DRAM

Y Kim, V Seshadri, D Lee, J Liu, O Mutlu - ACM SIGARCH Computer …, 2012 - dl.acm.org
Modern DRAMs have multiple banks to serve multiple memory requests in parallel.
However, when two requests go to the same bank, they have to be served serially …

Memory scaling: A systems architecture perspective

O Mutlu - 2013 5th IEEE International Memory Workshop, 2013 - ieeexplore.ieee.org
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

OWL: Cooperative thread array aware scheduling techniques for improving GPGPU performance

A Jog, O Kayiran, N Chidambaram Nachiappan… - ACM SIGPLAN …, 2013 - dl.acm.org
Emerging GPGPU architectures, along with programming models like CUDA and OpenCL,
offer a cost-effective platform for many applications by providing high thread level …

Tiered-latency DRAM: A low latency and low cost DRAM architecture

D Lee, Y Kim, V Seshadri, J Liu… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org
The capacity and cost-per-bit of DRAM have historically scaled to satisfy the needs of
increasingly large and complex computer systems. However, DRAM latency has remained …

SynCron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Research problems and opportunities in memory systems

O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems

R Ausavarungnirun, KKW Chang… - ACM SIGARCH …, 2012 - dl.acm.org
When multiple processor (CPU) cores and a GPU integrated together on the same chip
share the off-chip main memory, requests from the GPU can heavily interfere with requests …