CERE: LLVM-based codelet extractor and replayer for piecewise benchmarking and optimization
This article presents Codelet Extractor and REplayer (CERE), an open-source framework for
code isolation. CERE finds and extracts the hotspots of an application as isolated fragments …
Type-based gradual typing performance optimization
Gradual typing has emerged as a popular design point in programming languages,
attracting significant interests from both academia and industry. Programmers in gradually …
A DSL-based runtime adaptivity framework for Java
This article presents Kadabra, a Java source-to-source compiler that allows users to make
code queries, code analysis and code transformations, all user-programmable using the …
I/O Optimisation and elimination via partial evaluation
CSF Smowton - 2014 - cl.cam.ac.uk
Computer programs commonly repeat work. Short programs go through the same
initialisation sequence each time they are run, and long-running servers may be given a …
Fast Template-Based Code Generation for MLIR
F Drescher, A Engelke - Proceedings of the 33rd ACM SIGPLAN …, 2024 - dl.acm.org
Fast compilation is essential for JIT-compilation use cases like dynamic languages or
databases as well as development productivity when compiling static languages. Template …
Efficient and scalable bit-matrix multiplication in bit-slice format
D Van Amstel - ACM SAC, 2012 - helcaraxan.eu
The bit-matrix multiplication (BMM) has until now only been implemented on the Cray
supercomputers. Since then multiple publications have proved the usefulness of this …
Microtools: Automating program generation and performance measurement
Tuning an application to a given architecture has become a complex procedure.
Sophisticated hardware obfuscates the path to easily writing peak-performance applications …
Improving performance through deep value profiling and specialization with code transformation
MA Khan - Computer Languages, Systems & Structures, 2011 - Elsevier
Specialization of code is used to improve the performance of the applications. However,
specialization based on ineffective profiles deteriorates the performance. Existing value …
Improving performance of optimized kernels through fast instantiations of templates
To fully exploit the instruction‐level parallelism offered by modern processors, compilers
need the necessary information available during the execution of the program. This …
Feedback-directed specialization of code
MA Khan - Computer Languages, Systems & Structures, 2010 - Elsevier
Based on feedback information, a large number of optimizations can be performed by the
compiler. This information actually indicates the changing behavior of the applications and …