Google Scholar

Dnnfusion: accelerating deep neural networks execution with advanced operator fusion

W Niu, J Guan, Y Wang, G Agrawal, B Ren - Proceedings of the 42nd …, 2021 - dl.acm.org

Deep Neural Networks (DNNs) have emerged as the core enabler of many major
applications on mobile devices. To achieve high accuracy, DNN models have become …

Cited by 157 · All 7 versions

[PDF] acm.org

Guided equality saturation

T Koehler, A Goens, S Bhat, T Grosser… - Proceedings of the …, 2024 - dl.acm.org

Rewriting is a principled term transformation technique with uses across theorem proving
and compilation. In theorem proving, each rewrite is a proof step; in compilation, rewrites …

Cited by 6 · All 13 versions

[PDF] whiterose.ac.uk

Optimizing Direct Convolutions on ARM Multi-Cores

P Wang, W Yang, J Fang, D Dong, C Huang… - Proceedings of the …, 2023 - dl.acm.org

Convolution kernels are widely seen in deep learning workloads and are often responsible
for performance bottlenecks. Recent research has demonstrated that a direct convolution …

Cited by 5 · All 6 versions

[PDF] arxiv.org

Neural architecture search as program transformation exploration

J Turner, EJ Crowley, MFP O'Boyle - Proceedings of the 26th ACM …, 2021 - dl.acm.org

Improving the performance of deep neural networks (DNNs) is important to both the compiler
and neural architecture search (NAS) communities. Compilers apply program …

Cited by 19 · All 9 versions

mGEMM: Low-latency convolution with minimal memory overhead optimized for mobile devices

J Park, K Bin, K Lee - Proceedings of the 20th Annual International …, 2022 - dl.acm.org

The convolution layer is the key building block in many neural network designs. Most high-
performance implementations of the convolution operation rely on GEMM (General Matrix …

Cited by 8 · All 2 versions

[PDF] arxiv.org

cuConv: CUDA implementation of convolution for CNN inference

M Jordà, P Valero-Lara, AJ Peña - Cluster Computing, 2022 - Springer

Convolutions are the core operation of deep learning applications based on Convolutional
Neural Networks (CNNs). Current GPU architectures are highly efficient for training and …

Cited by 14 · All 6 versions

High performance dilated convolutions on multi-core DSPs

Y Wang, Q Wang, X Pei, S Mei, R Li, J Liu - CCF Transactions on High …, 2024 - Springer

Dilated convolutions are widely used to accomplish wide receptive fields while keeping the
resolution of feature maps in deep learning applications, such as semantic segmentation …

Cited by 4 · All 2 versions

[PDF] ed.ac.uk

Mapping parallelism in a functional IR through constraint satisfaction: a case study on convolution for mobile GPUs

N Mogers, L Li, V Radu, C Dubach - Proceedings of the 31st ACM …, 2022 - dl.acm.org

Graphics Processing Units (GPUs) are notoriously hard to optimize for manually. What is
needed are good automatic code generators and optimizers. Accelerate, Futhark and Lift …

Cited by 6 · All 5 versions

[PDF] arxiv.org

Sketch-guided equality saturation: Scaling equality saturation to complex optimizations of functional programs

T Koehler, P Trinder, M Steuwer - arXiv preprint arXiv:2111.13040, 2021 - arxiv.org

Generating high-performance code for diverse hardware and application domains is
challenging. Functional array programming languages with patterns like map and reduce …

Cited by 4 · All 2 versions

[PDF] thok.eu

[PDF] Sketch-Guided Equality Saturation

T Koehler, P Trinder, M Steuwer - 2022 - thok.eu

Equality saturation is a technique for implementing rewrite-driven compiler optimizations by
efficiently representing many equivalent programs in so-called e-graphs. To improve …

Cited by 2 · All 2 versions
