DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

Understanding reduced-voltage operation in modern DRAM devices: Experimental characterization, analysis, and mechanisms

KK Chang, AG Yağlıkçı, S Ghose, A Agrawal… - Proceedings of the …, 2017 - dl.acm.org
The energy consumption of DRAM is a critical concern in modern computing systems.
Improvements in manufacturing process technology have allowed DRAM vendors to lower …

Memory scaling: A systems architecture perspective

O Mutlu - 2013 5th IEEE International Memory Workshop, 2013 - ieeexplore.ieee.org
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

Research problems and opportunities in memory systems

O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

Enabling interposer-based disintegration of multi-core processors

A Kannan, NE Jerger, GH Loh - … of the 48th international symposium on …, 2015 - dl.acm.org
Silicon interposers enable the integration of multiple stacks of in-package memory to provide
higher bandwidth or lower energy for memory accesses. Once the interposer has been paid …

Mosaic: a GPU memory manager with application-transparent support for multiple page sizes

R Ausavarungnirun, J Landgraf, V Miller… - Proceedings of the 50th …, 2017 - dl.acm.org
Contemporary discrete GPUs support rich memory management features such as virtual
memory and demand paging. These features simplify GPU programming by providing a …

MemScale: active low-power modes for main memory

Q Deng, D Meisner, L Ramos, TF Wenisch… - ACM SIGPLAN …, 2011 - dl.acm.org
Main memory is responsible for a large and increasing fraction of the energy consumed by
servers. Prior work has focused on exploiting DRAM low-power states to conserve energy …

Load value approximation

J San Miguel, M Badr, NE Jerger - 2014 47th Annual IEEE …, 2014 - ieeexplore.ieee.org
Approximate computing explores opportunities that emerge when applications can tolerate
error or inexactness. These applications, which range from multimedia processing to …

CHIPPER: A low-complexity bufferless deflection router

C Fallin, C Craik, O Mutlu - 2011 IEEE 17th International …, 2011 - ieeexplore.ieee.org
As Chip Multiprocessors (CMPs) scale to tens or hundreds of nodes, the interconnect
becomes a significant factor in cost, energy consumption and performance. Recent work has …

MASK: Redesigning the GPU memory hierarchy to support multi-application concurrency

R Ausavarungnirun, V Miller, J Landgraf… - ACM SIGPLAN …, 2018 - dl.acm.org
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …