Google neural network models for edge devices: Analyzing and mitigating machine learning inference bottlenecks
Emerging edge computing platforms often contain machine learning (ML) accelerators that
can accelerate inference for a wide range of neural network (NN) models. These models are …
Co-exploration of neural architectures and heterogeneous ASIC accelerator designs targeting multiple tasks
Neural Architecture Search (NAS) has demonstrated its power on various AI accelerating
platforms such as Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units …
CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture
Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …
XRBench: An extended reality (XR) machine learning benchmark suite for the metaverse
Real-time multi-task multi-model (MTMM) workloads, a new form of deep learning inference
workloads, are emerging for application areas like extended reality (XR) to support …
MoCA: Memory-centric, adaptive execution for multi-tenant deep neural networks
Driven by the wide adoption of deep neural networks (DNNs) across different application
domains, multi-tenancy execution, where multiple DNNs are deployed simultaneously on …
Creating the future: Augmented reality, the next human-machine interface
M Abrash - 2021 IEEE International Electron Devices Meeting …, 2021 - ieeexplore.ieee.org
XR, consisting of Virtual Reality (VR) and Augmented Reality (AR) together, will be the next
general computing platform, dominating our relationship with the digital world for the next 50 …
HighLight: Efficient and flexible DNN acceleration with hierarchical structured sparsity
Due to complex interactions among various deep neural network (DNN) optimization
techniques, modern DNNs can have weights and activations that are dense or sparse with …
TileFlow: A framework for modeling fusion dataflow via tree-based analysis
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …
An architecture-level analysis on deep learning models for low-impact computations
Deep neural networks (DNNs) have made significant achievements in a wide variety of
domains. For the deep learning tasks, multiple excellent hardware platforms provide efficient …
Reconfigurability, why it matters in AI tasks processing: A survey of reconfigurable AI chips
Nowadays, artificial intelligence (AI) technologies, especially deep neural networks (DNNs),
play a vital role in solving many problems in both academia and industry. In order to …