A survey of techniques for architecting and managing asymmetric multicore processors

S Mittal - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
To meet the needs of a diverse range of workloads, asymmetric multicore processors
(AMPs) have been proposed, which feature cores of different microarchitecture or ISAs …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …

Scheduling techniques for GPU architectures with processing-in-memory capabilities

A Pattnaik, X Tang, A Jog, O Kayiran… - Proceedings of the …, 2016 - dl.acm.org
Processing data in or near memory (PIM), as opposed to in conventional computational units
in a processor, can greatly alleviate the performance and energy penalties of data transfers …

SynCron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Research problems and opportunities in memory systems

O Mutlu, L Subramanian - Supercomputing frontiers and innovations, 2014 - superfri.org
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

DIMM-Link: Enabling efficient inter-DIMM communication for near-memory processing

Z Zhou, C Li, F Yang, G Sun - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
DIMM-based near-memory processing architectures (DIMM-NMP) have received growing
interest from both academia and industry. They have the advantages of large memory …

Event-based scheduling for energy-efficient QoS (eQoS) in mobile Web applications

Y Zhu, M Halpern, VJ Reddi - 2015 IEEE 21st International …, 2015 - ieeexplore.ieee.org
Mobile Web applications have become an integral part of our society. They pose a high
demand for application quality of service (QoS). However, the energy-constrained nature of …

Understanding and improving the latency of DRAM-based memory systems

KK Chang - 2017 - search.proquest.com
Over the past two decades, the storage capacity and access bandwidth of main memory
have improved tremendously, by 128x and 20x, respectively. These improvements are …

Greedy combinatorial test case generation using unsatisfiable cores

A Yamada, A Biere, C Artho, T Kitamura… - Proceedings of the 31st …, 2016 - dl.acm.org
Combinatorial testing aims at covering the interactions of parameters in a system under test,
while some combinations may be forbidden by given constraints (forbidden tuples). In this …

Exploiting core criticality for enhanced GPU performance

A Jog, O Kayiran, A Pattnaik, MT Kandemir… - Proceedings of the …, 2016 - dl.acm.org
Modern memory access schedulers employed in GPUs typically optimize for memory
throughput. They implicitly assume that all requests from different cores are equally …