Scalable work stealing
Irregular and dynamic parallel applications pose significant challenges to achieving
scalable performance on large-scale multicore clusters. These applications often require …
Dynamic circular work-stealing deque
D Chase, Y Lev - Proceedings of the seventeenth annual ACM …, 2005 - dl.acm.org
The non-blocking work-stealing algorithm of Arora, Blumofe, and Plaxton (henceforth ABP
work-stealing) is on its way to becoming the multiprocessor load balancing technology of …
Scheduling parallel programs by work stealing with private deques
Work stealing has proven to be an effective method for scheduling parallel programs on
multicore computers. To achieve high performance, work stealing distributes tasks between …
Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated
Building correct and efficient concurrent algorithms is known to be a difficult problem of
fundamental importance. To achieve efficiency, designers try to remove unnecessary and …
OpenMP task scheduling strategies for multicore NUMA systems
The recent addition of task parallelism to the OpenMP shared memory API allows
programmers to express concurrency at a high level of abstraction and places the burden of …
Idempotent work stealing
Load balancing is a technique which allows efficient parallelization of irregular workloads,
and a key component of many applications and parallelizing runtimes. Work-stealing is a …
Adaptive work-stealing with parallelism feedback
Multiprocessor scheduling in a shared multiprogramming environment can be structured as
two-level scheduling, where a kernel-level job scheduler allots processors to jobs and a …
Lace: non-blocking split deque for work-stealing
Work-stealing is an efficient method to implement load balancing in fine-grained task
parallelism. Typically, concurrent deques are used for this purpose. A disadvantage of many …
Read/write fence-free work-stealing with multiplicity
It has been shown that any nonblocking algorithm for work-stealing in the standard
asynchronous shared memory model of computation must use expensive Read-After-Write …
Real-time ray tracing through the eyes of a game developer
J Bikker - 2007 IEEE Symposium on Interactive Ray Tracing, 2007 - ieeexplore.ieee.org
There has been and is a tremendous amount of research on the topic of ray tracing, spurred
by the relatively recent advent of real-time ray tracing and the inevitable appearance of …