SynCron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Warp scheduling for fine-grained synchronization

A ElTantawy, TM Aamodt - 2018 IEEE International Symposium …, 2018 - ieeexplore.ieee.org
Fine-grained synchronization is employed in many parallel algorithms and is often
implemented using busy-wait synchronization (e.g., spin locks). However, busy-wait …

Adaptive contention management for fine-grained synchronization on commodity GPUs

L Gao, J Wang, W Zhang - ACM Transactions on Architecture and Code …, 2022 - dl.acm.org
As more emerging applications are moving to GPUs, fine-grained synchronization has
become imperative. However, their performance can be severely impaired in case of …

Model Checking Embedded C Software Using k-Induction and Invariants

H Rocha, H Ismail, L Cordeiro, R Barreto - Embedded Software Verification …, 2017 - Springer
We present a novel proof by induction algorithm, which combines k-induction with invariants
to model check embedded C software with bounded and unbounded loops. The k-induction …

MiSAR: Minimalistic synchronization accelerator with resource overflow management

CK Liang, M Prvulovic - ACM SIGARCH Computer Architecture News, 2015 - dl.acm.org
While numerous hardware synchronization mechanisms have been proposed, they either
no longer function or suffer great performance loss when their hardware resources are …

Fast fine-grained global synchronization on GPUs

K Wang, D Fussell, C Lin - … of the Twenty-Fourth International Conference …, 2019 - dl.acm.org
This paper extends the reach of General Purpose GPU programming by presenting a
software architecture that supports efficient fine-grained synchronization over global …

Accelerating Irregular Applications via Efficient Synchronization and Data Access Techniques

C Giannoula - arXiv preprint arXiv:2211.05908, 2022 - arxiv.org
Irregular applications comprise an increasingly important workload domain for many fields,
including bioinformatics, chemistry, physics, social sciences and machine learning …

Notifying Memories for Dataflow Applications on Shared-Memory Parallel Computer

A Ghasemi - 2022 - theses.hal.science
Symmetric Shared-memory multiprocessor (SMP) is the most widely used implementation
of high-performance multi-core processors. It offers a uniform shared memory view that …

CASPAR: breaking serialization in lock-free multicore synchronization

T Gangwani, A Morrison, J Torrellas - ACM SIGARCH Computer …, 2016 - dl.acm.org
In multicores, performance-critical synchronization is increasingly performed in a lock-free
manner using atomic instructions such as CAS or LL/SC. However, when many processors …

An efficient and flexible hardware support for accelerating synchronization operations on the STHORM many-core architecture

F Thabet, Y Lhuillier, C Andriamisaina… - … , Automation & Test …, 2013 - ieeexplore.ieee.org
The current trend in embedded computing consists in increasing the number of processing
resources on a chip. Following this paradigm, the STMicroelectronics/CEA Platform 2012 …