A survey on compiler autotuning using machine learning
Since the mid-1990s, researchers have been trying to use machine-learning-based
approaches to solve a number of different compiler optimization problems. These …
A practical automatic polyhedral parallelizer and locality optimizer
We present the design and implementation of an automatic polyhedral source-to-source
transformation framework that can optimize regular programs (sequences of possibly …
Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
The polyhedral model provides powerful abstractions to optimize loop nests with regular
accesses. Affine transformations in this model capture a complex sequence of execution …
Pluto: A practical and fully automatic polyhedral program optimization system
We present the design and implementation of a fully automatic polyhedral source-to-source
transformation framework that can optimize regular programs (sequences of possibly …
Automatic C-to-CUDA code generation for affine programs
MM Baskaran, J Ramanujam… - … , CC 2010, Held as Part of …, 2010 - Springer
Graphics Processing Units (GPUs) offer tremendous computational power. CUDA
(Compute Unified Device Architecture) provides a multi-threaded parallel programming …
Polyhedral-based data reuse optimization for configurable computing
Many applications, such as medical imaging, generate intensive data traffic between the
FPGA and off-chip memory. Significant improvements in the execution time can be achieved …
CHiLL: A framework for composing high-level loop transformations
C Chen, J Chame, M Hall - 2008 - Citeseer
This paper describes a general and robust loop transformation framework that enables
compilers to generate efficient code on complex loop nests. Despite two decades of prior …
A compiler framework for optimization of affine loop nests for GPGPUs
MM Baskaran, U Bondhugula… - Proceedings of the …, 2008 - dl.acm.org
GPUs are a class of specialized parallel architectures with tremendous computational
power. The new Compute Unified Device Architecture (CUDA) programming model from …
Iterative optimization in the polyhedral model: Part II, multidimensional time
High-level loop optimizations are necessary to achieve good performance over a wide
variety of processors. Their performance impact can be significant because they involve in …