Edge learning using a fully integrated neuro-inspired memristor chip

W Zhang, P Yao, B Gao, Q Liu, D Wu, Q Zhang, Y Li… - Science, 2023 - science.org
Learning is highly important for edge intelligence devices to adapt to different application
scenes and owners. Current technologies for training neural networks require moving …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

Architecture of computing system based on chiplet

G Shan, Y Zheng, C Xing, D Chen, G Li, Y Yang - Micromachines, 2022 - mdpi.com
Computing systems are widely used in medical diagnosis, climate prediction, autonomous
vehicles, etc. As the key part of electronics, the performance of computing systems is crucial …

TensorDIMM: A practical near-memory processing architecture for embeddings and tensor operations in deep learning

Y Kwon, Y Lee, M Rhu - Proceedings of the 52nd Annual IEEE/ACM …, 2019 - dl.acm.org
Recent studies from several hyperscalers point to embedding layers as the most memory-
intensive deep learning (DL) algorithm being deployed in today's datacenters. This paper …

Non-deep networks

A Goyal, A Bochkovskiy, J Deng… - Advances in neural …, 2022 - proceedings.neurips.cc
Latency is of utmost importance in safety-critical systems. In neural networks, lowest
theoretical latency is dependent on the depth of the network. This begs the question--is it …

Using Chiplet Encapsulation Technology to Achieve Processing-in-Memory Functions

W Tian, B Li, Z Li, H Cui, J Shi, Y Wang, J Zhao - Micromachines, 2022 - mdpi.com
With the rapid development of 5G, artificial intelligence (AI), and high-performance
computing (HPC), there is a huge increase in the data exchanged between the processor …

A survey on deep learning hardware accelerators for heterogeneous HPC platforms

C Silvano, D Ielmini, F Ferrandi, L Fiorin… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable
solution for several classes of high-performance computing (HPC) applications such as …

Summarizing CPU and GPU design trends with product data

Y Sun, NB Agostini, S Dong, D Kaeli - arXiv preprint arXiv:1911.11313, 2019 - arxiv.org
Moore's Law and Dennard Scaling have guided the semiconductor industry for the past few
decades. Recently, both laws have faced validity challenges as transistor sizes approach …

Centaur: A chiplet-based, hybrid sparse-dense accelerator for personalized recommendations

R Hwang, T Kim, Y Kwon, M Rhu - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendations are the backbone machine learning (ML) algorithm that
powers several important application domains (e.g., ads, e-commerce, etc.) serviced from …

Energy-efficient artificial intelligence of things with intelligent edge

S Zhu, K Ota, M Dong - IEEE Internet of Things Journal, 2022 - ieeexplore.ieee.org
Artificial Intelligence of Things (AIoT) is an emerging area of future Internet of Things (IoT) to
support intelligent IoT applications. In AIoT, intelligent edge computing technologies …