Program optimization space pruning for a multithreaded GPU

S Ryoo, CI Rodrigues, SS Stone… - Proceedings of the 6th …, 2008 - dl.acm.org
Program optimization for highly-parallel systems has historically been considered an art,
with experts doing much of the performance tuning by hand. With the introduction of …

SUIF Explorer: an interactive and interprocedural parallelizer

SW Liao, A Diwan, RP Bosch Jr, A Ghuloum… - Proceedings of the …, 1999 - dl.acm.org
The SUIF Explorer is an interactive parallelization tool that is more effective than previous
systems in minimizing the number of lines of code that require programmer assistance. First …

Gated SSA-based demand-driven symbolic analysis for parallelizing compilers

P Tu, D Padua - Proceedings of the 9th International Conference on …, 1995 - dl.acm.org
In this paper, we present a GSA-based technique that performs more efficient and more
precise symbolic analysis of predicated assignments, recurrences and index arrays. The …

Detecting coarse-grain parallelism using an interprocedural parallelizing compiler

MH Hall, SP Amarasinghe, BR Murphy… - Proceedings of the …, 1995 - dl.acm.org
This paper presents an extensive empirical evaluation of an interprocedural parallelizing
compiler, developed as part of the Stanford SUIF compiler system. The system incorporates …

Compiler optimizations for eliminating barrier synchronization

CW Tseng - ACM SIGPLAN Notices, 1995 - dl.acm.org
This paper presents novel compiler optimizations for reducing synchronization overhead in
compiler-parallelized scientific codes. A hybrid programming model is employed to combine …

The range test: a dependence test for symbolic, non-linear expressions

W Blume, R Eigenmann - … '94: Proceedings of the 1994 ACM …, 1994 - ieeexplore.ieee.org
Most current data dependence tests cannot handle loop bounds or array subscripts that are
symbolic, nonlinear expressions (e.g. A(n*i+j), where 0 ≤ j ≤ n). We describe a …

Perspective: A sensible approach to speculative automatic parallelization

S Apostolakis, Z Xu, G Chan, S Campanoni… - Proceedings of the …, 2020 - dl.acm.org
The promise of automatic parallelization, freeing programmers from the error-prone and
time-consuming process of making efficient use of parallel processing resources, remains …

A study on popular auto‐parallelization frameworks

S Prema, R Nasre, R Jehadeesan… - Concurrency and …, 2019 - Wiley Online Library
We study five popular auto‐parallelization frameworks (Cetus, Par4all, Rose, ICC, and
Pluto) and compare them qualitatively as well as quantitatively. All the frameworks primarily …

Efficient building and placing of gating functions

P Tu, D Padua - Proceedings of the ACM SIGPLAN 1995 conference on …, 1995 - dl.acm.org
In this paper, we present an almost-linear time algorithm for constructing Gated Single
Assignment (GSA), which is SSA augmented with gating functions at φ-nodes. The gating …

Symbolic range propagation

W Blume, R Eigenmann - Proceedings of 9th International …, 1995 - ieeexplore.ieee.org
Many analyses and transformations in a parallelizing compiler can benefit from the ability to
compare arbitrary symbolic expressions. In this paper, we describe how one can compare …