Kokkos: Enabling manycore performance portability through polymorphic memory access patterns

HC Edwards, CR Trott, D Sunderland - Journal of parallel and distributed …, 2014 - Elsevier
The manycore revolution can be characterized by increasing thread counts, decreasing
memory per thread, and diversity of continually evolving manycore architectures. High …

Taskflow: A lightweight parallel and heterogeneous task graph computing system

TW Huang, DL Lin, CX Lin, Y Lin - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Taskflow aims to streamline the building of parallel and heterogeneous applications using a
lightweight task graph-based approach. Taskflow introduces an expressive task graph …

Personal data lake with data gravity pull

C Walker, H Alrehamy - … Conference on Big Data and Cloud …, 2015 - ieeexplore.ieee.org
This paper presents Personal Data Lake, a unified storage facility for storing, analyzing and
querying personal data. A data lake stores data regardless of format and thus provides an …

Multinode multi-GPU two-electron integrals: Code generation using the Regent language

KG Johnson, S Mirchandaney, E Hoag… - Journal of Chemical …, 2022 - ACS Publications
The computation of two-electron repulsion integrals (ERIs) is often the most expensive step
of integral-direct self-consistent field methods. Formally it scales as O(N^4), where N is the …

Collaboro: a collaborative (meta) modeling tool

JLC Izquierdo, J Cabot - PeerJ Computer Science, 2016 - peerj.com
Motivation: Scientists increasingly rely on intelligent information systems to help them in their
daily tasks, in particular for managing research objects, like publications or datasets. The …

Significance driven computation: a voltage-scalable, variation-aware, quality-tuning motion estimator

D Mohapatra, G Karakonstantis, K Roy - Proceedings of the 2009 ACM …, 2009 - dl.acm.org
In this paper we present a design methodology for algorithm/architecture co-design of a
voltage-scalable, process variation aware motion estimator based on significance driven …

Task-parallel programming with constrained parallelism

TW Huang, L Hwang - 2022 IEEE High Performance Extreme …, 2022 - ieeexplore.ieee.org
Task graph programming model (TGPM) has become central to a wide range of scientific
computing applications because it enables top-down optimization of parallelism that …

Sigmoid: An auto-tuned load balancing algorithm for heterogeneous systems

B Pérez, E Stafford, JL Bosque, R Beivide - Journal of Parallel and …, 2021 - Elsevier
A challenge that heterogeneous system programmers face is leveraging the performance of
all the devices that integrate the system. This paper presents Sigmoid, a new load balancing …

Simplifying programming and load balancing of data parallel applications on heterogeneous systems

B Pérez, JL Bosque, R Beivide - … of the 9th Annual Workshop on General …, 2016 - dl.acm.org
Heterogeneous architectures have experienced a great development thanks to their
excellent cost/performance ratio and low power consumption. But heterogeneity significantly …

EngineCL: usability and performance in heterogeneous computing

R Nozal, JL Bosque, R Beivide - Future Generation Computer Systems, 2020 - Elsevier
Heterogeneous systems have become one of the most common architectures today, thanks
to their excellent performance and energy consumption. However, due to their heterogeneity …