An approach for quantitative analysis of application-specific dataflow architectures

B Kienhuis, E Deprettere, K Vissers… - Proceedings IEEE …, 1997 - ieeexplore.ieee.org
In this paper we present an approach for quantitative analysis of application-specific
dataflow architectures. The approach allows the designer to rate design alternatives in a …

[BOOK][B] The compiler design handbook: optimizations and machine code generation

YN Srikant, P Shankar - 2002 - taylorfrancis.com
The widespread use of object-oriented languages and Internet security concerns are just the
beginning. Add embedded systems, multiple memory banks, highly pipelined units …

A low-cost approach towards mixed task and data parallel scheduling

A Radulescu, AJC Van Gemund - International Conference on …, 2001 - ieeexplore.ieee.org
A relatively new trend in parallel programming scheduling is the so-called mixed task and
data scheduling. It has been shown that mixing task and data parallelism to solve large …

CPR: Mixed task and data parallel scheduling for distributed systems

A Radulescu, C Nicolescu… - … 15th International Parallel …, 2001 - ieeexplore.ieee.org
It is well-known that mixing task and data parallelism to solve large computational
applications often yields better speedups compared to either applying pure task parallelism …

Single-dimension software pipelining for multidimensional loops

H Rong, Z Tang, R Govindarajan, A Douillet… - ACM Transactions on …, 2007 - dl.acm.org
Traditionally, software pipelining is applied either to the innermost loop of a given loop nest
or from the innermost loop to outer loops. This paper proposes a three-step approach, called …

The loop parallelizer LooPo—announcement

M Griebl, C Lengauer - … Workshop on Languages and Compilers for …, 1996 - Springer
LooPo is a new loop parallelizing framework developed at the University of Passau to aid us
in research on the space-time mapping of loop nests. LooPo is available on the Web [11] …

Analysis and testing for error tolerant motion estimation

H Chung, A Ortega - … on Defect and Fault Tolerance in VLSI …, 2005 - ieeexplore.ieee.org
We propose a novel system-level error tolerance approach specifically targeted for
multimedia compression algorithms. In particular, we focus on the motion estimation process …

Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs

A Darte, F Vivien - International Journal of Parallel Programming, 1997 - Springer
This paper presents an optimal algorithm for detecting fine or medium grain parallelism in
nested loops whose dependences are described by an approximation of distance vectors by …

Affine-by-statement scheduling of uniform and affine loop nests over parametric domains

A Darte, Y Robert - Journal of Parallel and Distributed Computing, 1995 - Elsevier
This paper deals with parallel scheduling techniques for uniform and affine loop nests. We
deal with affine-by-statement scheduling, a powerful extension of Lamport's hyperplane …

Scheduling of partitioned regular algorithms on processor arrays with constrained resources

J Teich, L Thiele, L Zhang - Proceedings of International …, 1996 - ieeexplore.ieee.org
A single integer linear programming model for optimally scheduling partitioned regular
algorithms is presented. The herein presented methodology differs from existing methods in …