Scalable work stealing

J Dinan, DB Larkins, P Sadayappan… - Proceedings of the …, 2009 - dl.acm.org
Irregular and dynamic parallel applications pose significant challenges to achieving
scalable performance on large-scale multicore clusters. These applications often require …

Dynamic circular work-stealing deque

D Chase, Y Lev - Proceedings of the seventeenth annual ACM …, 2005 - dl.acm.org
The non-blocking work-stealing algorithm of Arora, Blumofe, and Plaxton (henceforth ABP
work-stealing) is on its way to becoming the multiprocessor load balancing technology of …

Scheduling parallel programs by work stealing with private deques

UA Acar, A Charguéraud, M Rainey - Proceedings of the 18th ACM …, 2013 - dl.acm.org
Work stealing has proven to be an effective method for scheduling parallel programs on
multicore computers. To achieve high performance, work stealing distributes tasks between …

Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated

H Attiya, R Guerraoui, D Hendler, P Kuznetsov… - ACM SIGPLAN …, 2011 - dl.acm.org
Building correct and efficient concurrent algorithms is known to be a difficult problem of
fundamental importance. To achieve efficiency, designers try to remove unnecessary and …

OpenMP task scheduling strategies for multicore NUMA systems

SL Olivier, AK Porterfield, KB Wheeler… - … Journal of High …, 2012 - journals.sagepub.com
The recent addition of task parallelism to the OpenMP shared memory API allows
programmers to express concurrency at a high level of abstraction and places the burden of …

Idempotent work stealing

MM Michael, MT Vechev, VA Saraswat - Proceedings of the 14th ACM …, 2009 - dl.acm.org
Load balancing is a technique which allows efficient parallelization of irregular workloads,
and a key component of many applications and parallelizing runtimes. Work-stealing is a …

Adaptive work-stealing with parallelism feedback

K Agrawal, CE Leiserson, Y He, WJ Hsu - ACM Transactions on …, 2008 - dl.acm.org
Multiprocessor scheduling in a shared multiprogramming environment can be structured as
two-level scheduling, where a kernel-level job scheduler allots processors to jobs and a …

Lace: non-blocking split deque for work-stealing

T van Dijk, JC van de Pol - Euro-Par 2014: Parallel Processing Workshops …, 2014 - Springer
Work-stealing is an efficient method to implement load balancing in fine-grained task
parallelism. Typically, concurrent deques are used for this purpose. A disadvantage of many …

Read/write fence-free work-stealing with multiplicity

A Castañeda, M Piña - Journal of Parallel and Distributed Computing, 2024 - Elsevier
It has been shown that any nonblocking algorithm for work-stealing in the standard
asynchronous shared memory model of computation must use expensive Read-After-Write …

Real-time ray tracing through the eyes of a game developer

J Bikker - 2007 IEEE Symposium on Interactive Ray Tracing, 2007 - ieeexplore.ieee.org
There has been and is a tremendous amount of research on the topic of ray tracing, spurred
by the relatively recent advent of real-time ray tracing and the inevitable appearance of …