A mechanistic performance model for superscalar out-of-order processors

S Eyerman, L Eeckhout, T Karkhanis… - ACM Transactions on …, 2009 - dl.acm.org
A mechanistic model for out-of-order superscalar processors is developed and then applied
to the study of microarchitecture resource scaling. The model divides execution time into …

Distributed microarchitectural protocols in the TRIPS prototype processor

K Sankaralingam, R Nagarajan… - 2006 39th Annual …, 2006 - ieeexplore.ieee.org
Growing on-chip wire delays will cause many future microarchitectures to be distributed, in
which hardware resources within a single processor become nodes on one or more …

InvisiFence: performance-transparent memory ordering in conventional multiprocessors

C Blundell, MMK Martin, TF Wenisch - Proceedings of the 36th annual …, 2009 - dl.acm.org
A multiprocessor's memory consistency model imposes ordering constraints among loads,
stores, atomic operations, and memory fences. Even for consistency models that relax …

DeSC: Decoupled supply-compute communication management for heterogeneous architectures

TJ Ham, JL Aragón, M Martonosi - Proceedings of the 48th International …, 2015 - dl.acm.org
Today's computers employ significant heterogeneity to meet performance targets at
manageable power. In adopting increased compute specialization, however, the relative …

Redefining the Role of the CPU in the Era of CPU-GPU Integration

M Arora, S Nath, S Mazumdar, SB Baden… - IEEE Micro, 2012 - ieeexplore.ieee.org
We've seen the quick adoption of GPUs as general-purpose computing engines in recent
years, fueled by high computational throughput and energy efficiency. There is heavier …

[BOOK][B] Multithreading architecture

M Nemirovsky, D Tullsen - 2022 - books.google.com
Multithreaded architectures now appear across the entire range of computing devices, from
the highest-performing general purpose devices to low-end embedded processors …

Kilo-instruction processors: Overcoming the memory wall

A Cristal, OJ Santana, F Cazorla, M Galluzzi… - IEEE Micro, 2005 - ieeexplore.ieee.org
Historically, advances in integrated circuit technology have driven improvements in
processor microarchitecture and led to today's microprocessors with sophisticated pipelines …

iCFP: Tolerating all-level cache misses in in-order processors

A Hilton, S Nagarakatte, A Roth - 2009 IEEE 15th International …, 2009 - ieeexplore.ieee.org
Growing concerns about power have revived interest in in-order pipelines. In-order pipelines
sacrifice single-thread performance. Specifically, they do not allow execution to flow freely …

Non-speculative load-load reordering in TSO

A Ros, TE Carlson, M Alipour, S Kaxiras - ACM SIGARCH Computer …, 2017 - dl.acm.org
In Total Store Order memory consistency (TSO), loads can be speculatively reordered to
improve performance. If a load-load reordering is seen by other cores, speculative loads …

Long term parking (LTP): criticality-aware resource allocation in OOO processors

A Sembrant, T Carlson, E Hagersten… - Proceedings of the 48th …, 2015 - dl.acm.org
Modern processors employ large structures (IQ, LSQ, register file, etc.) to expose instruction-
level parallelism (ILP) and memory-level parallelism (MLP). These resources are typically …