Multilevel algorithms for acyclic partitioning of directed acyclic graphs
We investigate the problem of partitioning the vertices of a directed acyclic graph into a
given number of parts. The objective function is to minimize the number or the total weight of …
Time complexity of in-memory solution of linear systems
In-memory computing (IMC) with cross-point resistive memory arrays has been shown to
accelerate data-centric computations, such as the training and inference of deep neural …
Pebbles, graphs, and a pinch of combinatorics: Towards tight I/O lower bounds for statically analyzable programs
Determining I/O lower bounds is a crucial step in obtaining communication-efficient parallel
algorithms, both across the memory hierarchy and between processors. Current approaches …
Parallel Loop Locality Analysis for Symbolic Thread Counts
Data movement limits program performance. This bottleneck is more significant in multi-
thread programs but more difficult to analyze, especially for multiple thread counts. For …
Formal Verification of Source-to-Source Transformations for HLS
High-level synthesis (HLS) can greatly facilitate the description of complex hardware
implementations, by raising the level of abstraction up to a classical imperative language …
Acyclic partitioning of large directed acyclic graphs
Finding a good partition of a computational directed acyclic graph associated with an
algorithm can help find an execution pattern improving data locality, conduct an analysis of …
IOOpt: automatic derivation of I/O complexity bounds for affine programs
A Olivry, G Iooss, N Tollenaere, A Rountev… - Proceedings of the …, 2021 - dl.acm.org
Evaluating the complexity of an algorithm is an important step when developing
applications, as it impacts both its time and energy performance. Computational complexity …
Automated derivation of parametric data movement lower bounds for affine programs
Researchers and practitioners have for long worked on improving the computational
complexity of algorithms, focusing on reducing the number of operations needed to perform …
Automatic Hardware Pragma Insertion in High-Level Synthesis: A Non-Linear Programming Approach
High-level synthesis, source-to-source compilers, and various Design Space Exploration
techniques for pragma insertion have significantly improved the Quality of Results of …
Brief Announcement: Red-Blue Pebbling with Multiple Processors: Time, Communication and Memory Trade-offs
The well-studied red-blue pebble game models the execution of an arbitrary computational
DAG by a single processor over a two-level memory hierarchy. We present a natural …