Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Dnnfusion: accelerating deep neural networks execution with advanced operator fusion
Deep Neural Networks (DNNs) have emerged as the core enabler of many major
applications on mobile devices. To achieve high accuracy, DNN models have become …
applications on mobile devices. To achieve high accuracy, DNN models have become …
Guided equality saturation
Rewriting is a principled term transformation technique with uses across theorem proving
and compilation. In theorem proving, each rewrite is a proof step; in compilation, rewrites …
and compilation. In theorem proving, each rewrite is a proof step; in compilation, rewrites …
Optimizing Direct Convolutions on ARM Multi-Cores
Convolution kernels are widely seen in deep learning workloads and are often responsible
for performance bottlenecks. Recent research has demonstrated that a direct convolution …
for performance bottlenecks. Recent research has demonstrated that a direct convolution …
Neural architecture search as program transformation exploration
Improving the performance of deep neural networks (DNNs) is important to both the compiler
and neural architecture search (NAS) communities. Compilers apply program …
and neural architecture search (NAS) communities. Compilers apply program …
mGEMM: Low-latency convolution with minimal memory overhead optimized for mobile devices
The convolution layer is the key building block in many neural network designs. Most high-
performance implementations of the convolution operation rely on GEMM (General Matrix …
performance implementations of the convolution operation rely on GEMM (General Matrix …
cuConv: CUDA implementation of convolution for CNN inference
Convolutions are the core operation of deep learning applications based on Convolutional
Neural Networks (CNNs). Current GPU architectures are highly efficient for training and …
Neural Networks (CNNs). Current GPU architectures are highly efficient for training and …
High performance dilated convolutions on multi-core DSPs
Dilated convolutions are widely used to accomplish wide receptive fields while kee** the
resolution of feature maps in deep learning applications, such as semantic segmentation …
resolution of feature maps in deep learning applications, such as semantic segmentation …
Map** parallelism in a functional IR through constraint satisfaction: a case study on convolution for mobile GPUs
Graphics Processing Units (GPUs) are notoriously hard to optimize for manually. What is
needed are good automatic code generators and optimizers. Accelerate, Futhark and Lift …
needed are good automatic code generators and optimizers. Accelerate, Futhark and Lift …
Sketch-guided equality saturation: Scaling equality saturation to complex optimizations of functional programs
Generating high-performance code for diverse hardware and application domains is
challenging. Functional array programming languages with patterns like map and reduce …
challenging. Functional array programming languages with patterns like map and reduce …
[PDF][PDF] Sketch-Guided Equality Saturation
Equality saturation is a technique for implementing rewritedriven compiler optimizations by
efficiently representing many equivalent programs in so-called e-graphs. To improve …
efficiently representing many equivalent programs in so-called e-graphs. To improve …