Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures

YS Shao, B Reagen, GY Wei, D Brooks - ACM SIGARCH Computer …, 2014 - dl.acm.org
Hardware specialization, in the form of accelerators that provide custom datapath and
control for specific algorithms and applications, promises impressive performance and …

HELIX: Automatic parallelization of irregular programs for chip multiprocessing

S Campanoni, T Jones, G Holloway, VJ Reddi… - Proceedings of the …, 2012 - dl.acm.org
We describe and evaluate HELIX, a new technique for automatic loop parallelization that
assigns successive iterations of a loop to separate threads. We show that the inter-thread …

ISA-independent workload characterization and its implications for specialized architectures

YS Shao, D Brooks - … on Performance Analysis of Systems and …, 2013 - ieeexplore.ieee.org
Specialized architectures will become increasingly important as the computing industry
demands more energy-efficient designs. The application-centric design style for these …

HELIX-RC: An architecture-compiler co-design for automatic parallelization of irregular programs

S Campanoni, K Brownell, S Kanev, TM Jones… - ACM SIGARCH …, 2014 - dl.acm.org
Data dependences in sequential programs limit parallelization because extracted threads
cannot run independently. Although thread-level speculation can avoid the need for precise …

HELIX-UP: Relaxing program semantics to unleash parallelization

S Campanoni, G Holloway, GY Wei… - 2015 IEEE/ACM …, 2015 - ieeexplore.ieee.org
Automatic generation of parallel code for general-purpose commodity processors is a
challenging computational problem. Nevertheless, there is a lot of latent thread-level …

Achieving consistent and comparable CPU evaluation outcomes

C Wang, L Wang, W Gao, Y Yang, Y Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
The SPEC CPU2017 benchmark suite is an industry standard for assessing CPU
performance. It adheres strictly to some workload and system configurations-arbitrary …

The HELIX project: Overview and directions

S Campanoni, T Jones, G Holloway, GY Wei… - Proceedings of the 49th …, 2012 - dl.acm.org
Parallelism has become the primary way to maximize processor performance and power
efficiency. But because creating parallel programs by hand is difficult and prone to error …

HELIX: Making the extraction of thread-level parallelism mainstream

S Campanoni, TM Jones, G Holloway, GY Wei… - IEEE Micro, 2012 - ieeexplore.ieee.org
Improving system performance increasingly depends on exploiting microprocessor
parallelism, yet mainstream compilers still don't parallelize code automatically. Helix …

Auto-HPCnet: An automatic framework to build neural network-based surrogate for high-performance computing applications

W Dong, G Kestor, D Li - … of the 32nd International Symposium on High …, 2023 - dl.acm.org
High-performance computing communities are increasingly adopting Neural Networks (NN)
as surrogate models in their applications to generate scientific insights. Replacing an …

WPC: whole-picture workload characterization

L Wang, K Yang, C Wang, W Gao, C Luo… - arXiv preprint arXiv …, 2023 - arxiv.org
This article raises an important and challenging workload characterization issue: can we
uncover each critical component across the stacks contributing what percentages to any …