Argobots: A lightweight low-level threading and tasking framework

S Seo, A Amer, P Balaji, C Bordage… - … on Parallel and …, 2017 - ieeexplore.ieee.org
In the past few decades, a number of user-level threading and tasking models have been
proposed in the literature to address the shortcomings of OS-level threads, primarily with …

Understanding and optimizing persistent memory allocation

W Cai, H Wen, HA Beadle, C Kjellqvist… - Proceedings of the …, 2020 - dl.acm.org
The proliferation of fast, dense, byte-addressable nonvolatile memory suggests that data
might be kept in pointer-rich "in-memory" format across program runs and even process and …

ScatterAlloc: Massively parallel dynamic memory allocation for the GPU

M Steinberger, M Kenzel, B Kainz… - 2012 Innovative …, 2012 - ieeexplore.ieee.org
In this paper, we analyze the special requirements of a dynamic memory allocator that is
designed for massively parallel architectures such as Graphics Processing Units (GPUs) …

Constructing neuronal network models in massively parallel environments

T Ippen, JM Eppler, HE Plesser… - Frontiers in …, 2017 - frontiersin.org
Recent advances in the development of data structures to represent spiking neuron network
models enable us to exploit the complete memory of petascale computers for a single brain …

Releasing memory with optimistic access: A hybrid approach to memory reclamation and allocation in lock-free programs

P Moreno, R Rocha - Proceedings of the 35th ACM Symposium on …, 2023 - dl.acm.org
Lock-free data structures are an important tool for the development of concurrent programs
as they provide scalability, low latency and avoid deadlocks, livelocks and priority inversion …

Memory at your service: Fast memory allocation for latency-critical services

A Pi, J Zhao, S Wang, X Zhou - Proceedings of the 22nd International …, 2021 - dl.acm.org
Co-location and memory sharing between latency-critical services, such as key-value store
and web search, and best-effort batch jobs is an appealing approach to improving memory …

SSMalloc: a low-latency, locality-conscious memory allocator with stable performance scalability

R Liu, H Chen - Proceedings of the Asia-Pacific Workshop on Systems, 2012 - dl.acm.org
Allocation latency, access locality and performance scalability are three key factors affecting
the efficiency of a memory allocator for many cores. However, many previous state-of-the-art …

Register efficient dynamic memory allocator for GPUs

M Vinkler, V Havran - Computer Graphics Forum, 2015 - Wiley Online Library
We compare five existing dynamic memory allocators optimized for GPUs and show their
strengths and weaknesses. In the measurements, we use three generic evaluation tests …

Performance implications of dynamic memory allocators on transactional memory systems

A Baldassin, E Borin, G Araujo - Proceedings of the 20th ACM SIGPLAN …, 2015 - dl.acm.org
Although dynamic memory management accounts for a significant part of the execution time
on many modern software systems, its impact on the performance of transactional memory …

Towards Efficient Cache Allocation for High-Frequency Checkpointing

A Maurya, B Nicolae, MM Rafique… - 2022 IEEE 29th …, 2022 - ieeexplore.ieee.org
While many HPC applications are known to have long runtimes, this is not always because
of single large runs: in many cases, this is due to ensembles composed of many short runs …