Shinjuku: Preemptive Scheduling for μsecond-scale Tail Latency
The recently proposed dataplanes for microsecond-scale applications, such as IX and
ZygOS, use non-preemptive policies to schedule requests to cores. For the many real-world …
Habanero-Java: the new adventures of old X10
In this paper, we present the Habanero-Java (HJ) language developed at Rice University as
an extension to the original Java-based definition of the X10 language. HJ includes a …
Xkaapi: A runtime system for data-flow task programming on heterogeneous architectures
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs and
accelerators, like GPUs. Programming such nodes is typically based on a combination of …
Optimizing load balancing and data-locality with data-aware scheduling
Load balancing techniques (e.g., work stealing) are important to obtain the best performance
for distributed task scheduling systems that have multiple schedulers making scheduling …
Personal data lake with data gravity pull
C Walker, H Alrehamy - … Conference on Big Data and Cloud …, 2015 - ieeexplore.ieee.org
This paper presents Personal Data Lake, a unified storage facility for storing, analyzing and
querying personal data. A data lake stores data regardless of format and thus provides an …
Understanding energy behaviors of thread management constructs
Java programmers are faced with numerous choices in managing concurrent execution on
multicore platforms. These choices often have different trade-offs (e.g., performance …
Hierarchical work stealing on manycore clusters
Partitioned Global Address Space languages like UPC offer a convenient way of
expressing large shared data structures, especially for irregular structures that require …
Customizable domain-specific computing
To meet computing needs and overcome power density limitations, the computing industry
has entered the era of parallelization. However, highly parallel, general-purpose computing …
Scalable and precise dynamic datarace detection for structured parallelism
Existing dynamic race detectors suffer from at least one of the following three limitations: (i)
space overhead per memory location grows linearly with the number of parallel threads [13] …
Taskstream: Accelerating task-parallel workloads by recovering program structure
Reconfigurable accelerators, like CGRAs and dataflow architectures, have come to
prominence for addressing data-processing problems. However, they are largely limited to …