- Academic Search

A graph placement methodology for fast chip design

A Mirhoseini, A Goldie, M Yazgan, JW Jiang… - Nature, 2021 - nature.com

Chip floorplanning is the engineering task of designing the physical layout of a computer
chip. Despite five decades of research, chip floorplanning has defied automation, requiring …

Cited by 658

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Cited by 30

Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

Cited by 95

Evaluating language models for efficient code generation

J Liu, S Xie, J Wang, Y Wei, Y Ding, L Zhang - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce Differential Performance Evaluation (DPE), a framework designed to reliably
evaluate Large Language Models (LLMs) for efficient code generation. Traditional coding …

Cited by 10

SurCo: Learning linear surrogates for combinatorial nonlinear optimization problems

AM Ferber, T Huang, D Zha… - International …, 2023 - proceedings.mlr.press

Optimization problems with nonlinear cost functions and combinatorial constraints appear in
many real-world applications but remain challenging to solve efficiently compared to their …

Cited by 29

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

Tensor computations overwhelm traditional general-purpose computing devices due to the
large amounts of data and operations of the computations. They call for a holistic solution …

Cited by 75

A learned performance model for tensor processing units

S Kaufman, P Phothilimthana, Y Zhou… - Proceedings of …, 2021 - proceedings.mlsys.org

Accurate hardware performance models are critical to efficient code generation. They can be
used by compilers to make heuristic decisions, by superoptimizers as a minimization …

Cited by 90

A full-stack search technique for domain optimized deep learning accelerators

D Zhang, S Huda, E Songhori, K Prabhu, Q Le… - Proceedings of the 27th …, 2022 - dl.acm.org

The rapidly-changing deep learning landscape presents a unique opportunity for building
inference accelerators optimized for specific datacenter-scale workloads. We propose Full …

Cited by 67

Piper: Multidimensional planner for DNN parallelization

JM Tarnawski, D Narayanan… - Advances in Neural …, 2021 - proceedings.neurips.cc

The rapid increase in sizes of state-of-the-art DNN models, and consequently the increase in
the compute and memory requirements of model training, has led to the development of …

Cited by 59

Robust scheduling with GFlowNets

DW Zhang, C Rainone, M Peschl… - arXiv preprint arXiv …, 2023 - arxiv.org

Finding the best way to schedule operations in a computation graph is a classical NP-hard
problem which is central to compiler optimization. However, evaluating the goodness of a …

Cited by 49


Transferable graph optimizers for ml compilers

A graph placement methodology for fast chip design
