- Academic Search

Data-centric AI: Perspectives and challenges

D Zha, ZP Bhat, KH Lai, F Yang, X Hu - Proceedings of the 2023 SIAM …, 2023 - SIAM

The role of data in building AI systems has recently been significantly magnified by the
emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model …

Cited by 113

Enabling resource-efficient AIoT system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

Cited by 30

A graph placement methodology for fast chip design

A Mirhoseini, A Goldie, M Yazgan, JW Jiang… - Nature, 2021 - nature.com

Chip floorplanning is the engineering task of designing the physical layout of a computer
chip. Despite five decades of research, chip floorplanning has defied automation, requiring …

Cited by 671

Chip placement with deep reinforcement learning

A Mirhoseini, A Goldie, M Yazgan, J Jiang… - arXiv preprint arXiv …, 2020 - arxiv.org

In this work, we present a learning-based approach to chip placement, one of the most
complex and time-consuming stages of the chip design process. Unlike prior methods, our …

Cited by 263

TopoOpt: Co-optimizing network topology and parallelization strategy for distributed training jobs

W Wang, M Khazraee, Z Zhong, M Ghobadi… - … USENIX Symposium on …, 2023 - usenix.org

We propose TopoOpt, a novel direct-connect fabric for deep neural network (DNN) training
workloads. TopoOpt co-optimizes the distributed training process across three dimensions …

Cited by 77

SiP-ML: high-bandwidth optical network interconnects for machine learning training

M Khani, M Ghobadi, M Alizadeh, Z Zhu… - Proceedings of the …, 2021 - dl.acm.org

This paper proposes optical network interconnects as a key enabler for building high-
bandwidth ML training clusters with strong scaling properties. Our design, called SiP-ML …

Cited by 100

Robust scheduling with GFlowNets

DW Zhang, C Rainone, M Peschl… - arXiv preprint arXiv …, 2023 - arxiv.org

Finding the best way to schedule operations in a computation graph is a classical NP-hard
problem which is central to compiler optimization. However, evaluating the goodness of a …

Cited by 52

Verifying learning-augmented systems

T Eliyahu, Y Kazak, G Katz, M Schapira - Proceedings of the 2021 ACM …, 2021 - dl.acm.org

The application of deep reinforcement learning (DRL) to computer and networked systems
has recently gained significant popularity. However, the obscurity of decisions by DRL …

Cited by 67

A learned performance model for tensor processing units

S Kaufman, P Phothilimthana, Y Zhou… - Proceedings of …, 2021 - proceedings.mlsys.org

Accurate hardware performance models are critical to efficient code generation. They can be
used by compilers to make heuristic decisions, by superoptimizers as a minimization …

Cited by 92

Piper: Multidimensional planner for DNN parallelization

JM Tarnawski, D Narayanan… - Advances in Neural …, 2021 - proceedings.neurips.cc

The rapid increase in sizes of state-of-the-art DNN models, and consequently the increase in
the compute and memory requirements of model training, has led to the development of …

Cited by 57

Placeto: Learning generalizable device placement algorithms for distributed machine learning

Data-centric AI: Perspectives and challenges
