Program reconditioning: Avoiding undefined behaviour when finding and reducing compiler bugs
We introduce program reconditioning, a method for allowing program generation and
differential testing to be used to find miscompilation bugs, and test-case reduction to be used …
Portable inter-workgroup barrier synchronisation for GPUs
Despite the growing popularity of GPGPU programming, there is not yet a portable and
formally-specified barrier that one can use to synchronise across workgroups. Moreover, the …
Towards Unified Analysis of GPU Consistency
H Tong, N Gavrilenko… - 29th ACM …, 2024 - hernanponcedeleon.github.io
After more than 30 years of research, there is a solid understanding of the consistency
guarantees given by CPU systems. Unfortunately, the same is not yet true for GPUs. The …
Parallel fractal image compression using quadtree partition with task and dynamic parallelism
Fractal image compression is a lossy compression technique based on the iterative function
system, which can be used to reduce the storage space and increase the speed of data …
GPUHarbor: Testing GPU memory consistency at large (experience paper)
Memory consistency specifications (MCSs) are a difficult, yet critical, part of a concurrent
programming framework. Existing MCS testing tools are not immediately accessible, and …
Automated test generation for OpenCL kernels using fuzzing and constraint solving
Graphics Processing Units (GPUs) are massively parallel processors offering performance
acceleration and energy efficiency unmatched by current processors (CPUs) in computers …
Redwood: Flexible and Portable Heterogeneous Tree Traversal Workloads
Shared memory heterogeneous systems are now mainstream, with nearly every mobile
phone and tablet containing integrated processing units. However, developing applications …
GPU schedulers: how fair is fair enough?
Blocking synchronisation idioms, e.g. mutexes and barriers, play an important role in
concurrent programming. However, systems with semi-fair schedulers, e.g. graphics …
Training progressively binarizing deep networks using FPGAs
While hardware implementations of inference routines for Binarized Neural Networks
(BNNs) are plentiful, current realizations of efficient BNN hardware training accelerators …
CLTestCheck: Measuring test effectiveness for GPU kernels
Massive parallelism, and energy efficiency of GPUs, along with advances in their
programmability with OpenCL and CUDA programming models have made them attractive …