Cloud computing landscape and research challenges regarding trust and reputation

SM Habib, S Ries, M Mühlhäuser - 2010 7th International …, 2010 - ieeexplore.ieee.org
Cloud Computing is an emerging computing paradigm. It shares massively scalable, elastic
resources (e.g., data, calculations, and services) transparently among the users over a …

Futhark: purely functional GPU-programming with nested parallelism and in-place array updates

T Henriksen, NGW Serup, M Elsman… - Proceedings of the 38th …, 2017 - dl.acm.org
Futhark is a purely functional data-parallel array language that offers a machine-neutral
programming model and an optimising compiler that generates OpenCL code for GPUs …

Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code

M Steuwer, C Fensch, S Lindley, C Dubach - ACM SIGPLAN Notices, 2015 - dl.acm.org
Computers have become increasingly complex with the emergence of heterogeneous
hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous …

A compiler for throughput optimization of graph algorithms on GPUs

S Pai, K Pingali - Proceedings of the 2016 ACM SIGPLAN International …, 2016 - dl.acm.org
Writing high-performance GPU implementations of graph algorithms can be challenging. In
this paper, we argue that three optimizations called throughput optimizations are key to high …

Optimising purely functional GPU programs

TL McDonell, MMT Chakravarty, G Keller… - ACM SIGPLAN …, 2013 - dl.acm.org
Purely functional, embedded array programs are a good match for SIMD hardware, such as
GPUs. However, the naive compilation of such programs quickly leads to both code …

Incremental flattening for nested data parallelism

T Henriksen, F Thorøe, M Elsman… - Proceedings of the 24th …, 2019 - dl.acm.org
Compilation techniques for nested-parallel applications that can adapt to hardware and
dataset characteristics are vital for unlocking the power of modern hardware. This paper …

Dynamic thread block launch: A lightweight execution mechanism to support irregular applications on GPUs

J Wang, N Rubin, A Sidelnik… - ACM SIGARCH Computer …, 2015 - dl.acm.org
GPUs have been proven effective for structured applications that map well to the rigid 1D-3D
grid of threads in modern bulk synchronous parallel (BSP) programming languages …

LaPerm: Locality aware scheduler for dynamic parallelism on GPUs

J Wang, N Rubin, A Sidelnik… - ACM SIGARCH Computer …, 2016 - dl.acm.org
Recent developments in GPU execution models and architectures have introduced dynamic
parallelism to facilitate the execution of irregular applications where control flow and …

Baechi: fast device placement of machine learning graphs

B Jeon, L Cai, P Srivastava, J Jiang, X Ke… - Proceedings of the 11th …, 2020 - dl.acm.org
Machine Learning graphs (or models) can be challenging or impossible to train when either
devices have limited memory, or the models are large. Splitting the model graph across …

Wireframe: Supporting data-dependent parallelism through dependency graph execution in GPUs

AA Abdolrashidi, D Tripathy, ME Belviranli… - Proceedings of the 50th …, 2017 - dl.acm.org
GPUs lack fundamental support for data-dependent parallelism and synchronization. While
CUDA Dynamic Parallelism signals progress in this direction, many limitations and …