A graph placement methodology for fast chip design

A Mirhoseini, A Goldie, M Yazgan, JW Jiang… - Nature, 2021 - nature.com
Chip floorplanning is the engineering task of designing the physical layout of a computer
chip. Despite five decades of research 1, chip floorplanning has defied automation, requiring …

A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Learning scheduling algorithms for data processing clusters

H Mao, M Schwarzkopf, SB Venkatakrishnan… - Proceedings of the …, 2019 - dl.acm.org
Efficiently scheduling data processing jobs on distributed compute clusters requires complex
algorithms. Current systems use simple, generalized heuristics and ignore workload …

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI+IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Chip placement with deep reinforcement learning

A Mirhoseini, A Goldie, M Yazgan, J Jiang… - arXiv preprint arXiv …, 2020 - arxiv.org
In this work, we present a learning-based approach to chip placement, one of the most
complex and time-consuming stages of the chip design process. Unlike prior methods, our …

SiP-ML: high-bandwidth optical network interconnects for machine learning training

M Khani, M Ghobadi, M Alizadeh, Z Zhu… - Proceedings of the …, 2021 - dl.acm.org
This paper proposes optical network interconnects as a key enabler for building
high-bandwidth ML training clusters with strong scaling properties. Our design, called SiP-ML …

Verifying learning-augmented systems

T Eliyahu, Y Kazak, G Katz, M Schapira - Proceedings of the 2021 ACM …, 2021 - dl.acm.org
The application of deep reinforcement learning (DRL) to computer and networked systems
has recently gained significant popularity. However, the obscurity of decisions by DRL …

DreamShard: Generalizable embedding table placement for recommender systems

D Zha, L Feng, Q Tan, Z Liu, KH Lai… - Advances in …, 2022 - proceedings.neurips.cc
We study embedding table placement for distributed recommender systems, which aims to
partition and place the tables on multiple hardware devices (e.g., GPUs) to balance the …

A learned performance model for tensor processing units

S Kaufman, P Phothilimthana, Y Zhou… - Proceedings of …, 2021 - proceedings.mlsys.org
Accurate hardware performance models are critical to efficient code generation. They can be
used by compilers to make heuristic decisions, by superoptimizers as a minimization …

Piper: Multidimensional planner for DNN parallelization

JM Tarnawski, D Narayanan… - Advances in Neural …, 2021 - proceedings.neurips.cc
The rapid increase in sizes of state-of-the-art DNN models, and consequently the increase in
the compute and memory requirements of model training, has led to the development of …