Programming and synthesis for software-defined FPGA acceleration: status and future prospects

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org
FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

TVM: An automated end-to-end optimizing compiler for deep learning

T Chen, T Moreau, Z Jiang, L Zheng, E Yan… - … USENIX Symposium on …, 2018 - usenix.org
There is an increasing need to bring machine learning to a wide diversity of hardware
devices. Current frameworks rely on vendor-specific operator libraries and optimize for a …

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …

The sparse polyhedral framework: Composing compiler-generated inspector-executor code

MM Strout, M Hall, C Olschanowsky - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

Learning to optimize tensor programs

T Chen, L Zheng, E Yan, Z Jiang… - Advances in …, 2018 - proceedings.neurips.cc
We introduce a learning-based framework to optimize tensor programs for deep learning
workloads. Efficient implementations of tensor operators, such as matrix multiplication and …

Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions

N Vasilache, O Zinenko, T Theodoridis, P Goyal… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep learning models with convolutional and recurrent networks are now ubiquitous and
analyze massive amounts of audio, image, video, text and graph data, with applications in …

Taichi: a language for high-performance computation on spatially sparse data structures

Y Hu, TM Li, L Anderson, J Ragan-Kelley… - ACM Transactions on …, 2019 - dl.acm.org
3D visual computing data are often spatially sparse. To exploit such sparsity, people have
developed hierarchical sparse data structures, such as multi-level sparse voxel grids …

DNNFusion: accelerating deep neural networks execution with advanced operator fusion

W Niu, J Guan, Y Wang, G Agrawal, B Ren - Proceedings of the 42nd …, 2021 - dl.acm.org
Deep Neural Networks (DNNs) have emerged as the core enabler of many major
applications on mobile devices. To achieve high accuracy, DNN models have become …