Google Scholar

A survey on compiler autotuning using machine learning

AH Ashouri, W Killian, J Cavazos, G Palermo… - ACM Computing …, 2018 - dl.acm.org

Since the mid-1990s, researchers have been trying to use machine-learning-based
approaches to solve a number of different compiler optimization problems. These …

Cited by 286

Deep configuration performance learning: A systematic survey and taxonomy

J Gong, T Chen - ACM Transactions on Software Engineering and …, 2024 - dl.acm.org

Performance is arguably the most crucial attribute that reflects the quality of a configurable
software system. However, given the increasing scale and complexity of modern software …

Cited by 10

End-to-end deep learning of optimization heuristics

C Cummins, P Petoumenos, Z Wang… - 2017 26th …, 2017 - ieeexplore.ieee.org

Accurate automatic optimization heuristics are necessary for dealing with the complexity and
diversity of modern hardware and software. Machine learning is a proven technique for …

Cited by 266

Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models

RB Roy, T Patel, V Gadepally, D Tiwari - Proceedings of the 42nd ACM …, 2021 - dl.acm.org

As parallel applications become more complex, auto-tuning becomes more desirable,
challenging, and time-consuming. We propose Bliss, a novel solution for auto-tuning …

Cited by 61

Asymo: scalable and efficient deep-learning inference on asymmetric mobile CPUs

M Wang, S Ding, T Cao, Y Liu, F Xu - Proceedings of the 27th Annual …, 2021 - dl.acm.org

On-device deep learning (DL) inference has attracted vast interest. Mobile CPUs are the
most common hardware for on-device inference and many inference frameworks have been …

Cited by 63

Synthesizing benchmarks for predictive modeling

C Cummins, P Petoumenos, Z Wang… - 2017 IEEE/ACM …, 2017 - ieeexplore.ieee.org

Predictive modeling using machine learning is an effective method for building compiler
heuristics, but there is a shortage of benchmarks. Typical machine learning experiments …

Cited by 128

CLBlast: A tuned OpenCL BLAS library

C Nugteren - Proceedings of the International Workshop on OpenCL, 2018 - dl.acm.org

This work introduces CLBlast, an open-source BLAS library providing optimized OpenCL
routines to accelerate dense linear algebra for a wide variety of devices. It is targeted at …

Cited by 113

Kernel Tuner: A search-optimizing GPU code auto-tuner

B van Werkhoven - Future Generation Computer Systems, 2019 - Elsevier

A very common problem in GPU programming is that some combination of thread block
dimensions and other code optimization parameters, like tiling or unrolling factors, results in …

Cited by 93

A benchmark set of highly-efficient CUDA and OpenCL kernels and its dynamic autotuning with Kernel Tuning Toolkit

F Petrovič, D Střelák, J Hozzová, J Ol'ha… - Future Generation …, 2020 - Elsevier

In recent years, the heterogeneity of both commodity and supercomputers hardware has
increased sharply. Accelerators, such as GPUs or Intel Xeon Phi co-processors, are often …

Cited by 56

Romou: Rapidly generate high-performance tensor kernels for mobile GPUs

R Liang, T Cao, J Wen, M Wang, Y Wang… - Proceedings of the 28th …, 2022 - dl.acm.org

Mobile GPU, as a ubiquitous and powerful accelerator, plays an important role in
accelerating on-device DNN (Deep Neural Network) inference. The frequent-upgrade and …

Cited by 15

Machine learning based auto-tuning for enhanced OpenCL performance portability
