A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Sniper: Exploring the level of abstraction for scalable and accurate parallel multi-core simulation

TE Carlson, W Heirman, L Eeckhout - Proceedings of 2011 International …, 2011 - dl.acm.org
Two major trends in high-performance computing, namely, larger numbers of cores and the
growing size of on-chip cache memory, are creating significant challenges for evaluating the …

DeepDive: Transparently identifying and managing performance interference in virtualized environments

D Novaković, N Vasić, S Novaković, D Kostić… - 2013 USENIX Annual …, 2013 - usenix.org
We describe the design and implementation of DeepDive, a system for transparently
identifying and managing performance interference between virtual machines (VMs) co …

CALOREE: Learning control for predictable latency and low energy

N Mishra, C Imes, JD Lafferty, H Hoffmann - ACM SIGPLAN Notices, 2018 - dl.acm.org
Many modern computing systems must provide reliable latency with minimal energy. Two
central challenges arise when allocating system resources to meet these conflicting …

Interval simulation: Raising the level of abstraction in architectural simulation

D Genbrugge, S Eyerman… - HPCA-16 2010 The …, 2010 - ieeexplore.ieee.org
Detailed architectural simulators suffer from a long development cycle and extremely long
evaluation times. This longstanding problem is further exacerbated in the multi-core …

A probabilistic graphical model-based approach for minimizing energy under performance constraints

N Mishra, H Zhang, JD Lafferty… - ACM SIGARCH Computer …, 2015 - dl.acm.org
In many deployments, computer systems are underutilized--meaning that applications have
performance requirements that demand less than full system capacity. Ideally, we would …

Apparatus and method for optimizing quantifiable behavior in configurable devices and systems

H Hoffmann, J Lafferty, N Mishra - US Patent 11,009,836, 2021 - Google Patents
This disclosure relates to a method and apparatus for selecting a computational
configuration of an electronic device executing an application in order to improve energy …

Performance interfaces for hardware accelerators

J Ma, R Iyer, S Kashani, M Emami, T Bourgeat… - … USENIX Symposium on …, 2024 - usenix.org
Designing and building a system that reaps the performance benefits of hardware
accelerators is challenging, because they provide little concrete visibility into their expected …

Towards compositionality in execution time analysis: definition and challenges

S Hahn, J Reineke, R Wilhelm - ACM SIGBED Review, 2015 - dl.acm.org
For hard real-time systems, timeliness of operations has to be guaranteed. Static timing
analysis is therefore employed to compute upper bounds on the execution times of a …

Flicker: A dynamically adaptive architecture for power limited multicore systems

P Petrica, AM Izraelevitz, DH Albonesi… - Proceedings of the 40th …, 2013 - dl.acm.org
Future microprocessors may become so power constrained that not all transistors will be
able to be powered on at once. These systems will be required to nimbly adapt to changes …