Google neural network models for edge devices: Analyzing and mitigating machine learning inference bottlenecks

A Boroumand, S Ghose, B Akin… - 2021 30th …, 2021 - ieeexplore.ieee.org
Emerging edge computing platforms often contain machine learning (ML) accelerators that
can accelerate inference for a wide range of neural network (NN) models. These models are …

Co-exploration of neural architectures and heterogeneous ASIC accelerator designs targeting multiple tasks

L Yang, Z Yan, M Li, H Kwon, L Lai… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
Neural Architecture Search (NAS) has demonstrated its power on various AI accelerating
platforms such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units …

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture

J Zhuang, J Lau, H Ye, Z Yang, Y Du, J Lo… - Proceedings of the …, 2023 - dl.acm.org
Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …

XRBench: An extended reality (XR) machine learning benchmark suite for the metaverse

H Kwon, K Nair, J Seo, J Yik… - Proceedings of …, 2023 - proceedings.mlsys.org
Real-time multi-task multi-model (MTMM) workloads, a new form of deep learning inference
workloads, are emerging for application areas like extended reality (XR) to support …

MoCA: Memory-centric, adaptive execution for multi-tenant deep neural networks

S Kim, H Genc, VV Nikiforov, K Asanović… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Driven by the wide adoption of deep neural networks (DNNs) across different application
domains, multi-tenancy execution, where multiple DNNs are deployed simultaneously on …

Creating the future: Augmented reality, the next human-machine interface

M Abrash - 2021 IEEE International Electron Devices Meeting …, 2021 - ieeexplore.ieee.org
XR, consisting of Virtual Reality (VR) and Augmented Reality (AR) together, will be the next
general computing platform, dominating our relationship with the digital world for the next 50 …

HighLight: Efficient and flexible DNN acceleration with hierarchical structured sparsity

YN Wu, PA Tsai, S Muralidharan, A Parashar… - Proceedings of the 56th …, 2023 - dl.acm.org
Due to complex interactions among various deep neural network (DNN) optimization
techniques, modern DNNs can have weights and activations that are dense or sparse with …

TileFlow: A framework for modeling fusion dataflow via tree-based analysis

S Zheng, S Chen, S Gao, L Jia, G Sun… - Proceedings of the 56th …, 2023 - dl.acm.org
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …

An architecture-level analysis on deep learning models for low-impact computations

H Li, Z Wang, X Yue, W Wang, H Tomiyama… - Artificial Intelligence …, 2023 - Springer
Deep neural networks (DNNs) have made significant achievements in a wide variety of
domains. For the deep learning tasks, multiple excellent hardware platforms provide efficient …

Reconfigurability, why it matters in AI tasks processing: A survey of reconfigurable AI chips

S Wei, X Lin, F Tu, Y Wang, L Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Nowadays, artificial intelligence (AI) technologies, especially deep neural networks (DNNs),
play a vital role in solving many problems in both academia and industry. In order to …