- Academic Search

A Survey of Design and Optimization for Systolic Array-based DNN Accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org

In recent years, it has been witnessed that the systolic array is a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays also encountered many …

Cited by 41

[PDF] scichina.com

In-memory computing with emerging nonvolatile memory devices

C Cheng, PJ Tiw, Y Cai, X Yan, Y Yang… - Science China Information …, 2021 - Springer

The von Neumann bottleneck and memory wall have posed fundamental limitations in
latency and energy consumption of modern computers based on von Neumann architecture …

Cited by 56

[PDF] arxiv.org

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer

Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Cited by 242

[PDF] acm.org

Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology

V Seshadri, D Lee, T Mullins, H Hassan… - Proceedings of the 50th …, 2017 - dl.acm.org

Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large
bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to …

Cited by 670

[PDF] usenix.org

LegoOS: A disseminated, distributed OS for hardware resource disaggregation

Y Shan, Y Huang, Y Chen, Y Zhang - 13th USENIX Symposium on …, 2018 - usenix.org

The monolithic server model where a server is the unit of deployment, operation, and failure
is meeting its limits in the face of several recent hardware and application trends. To improve …

Cited by 448

[PDF] arxiv.org

Neural cache: Bit-serial in-cache acceleration of deep neural networks

C Eckert, X Wang, J Wang… - 2018 ACM/IEEE …, 2018 - ieeexplore.ieee.org

This paper presents the Neural Cache architecture, which re-purposes cache structures to
transform them into massively parallel compute units capable of running inferences for Deep …

Cited by 478

[PDF] ieee.org

Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system

J Gómez-Luna, I El Hajj, I Fernandez… - IEEE …, 2022 - ieeexplore.ieee.org

Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …

Cited by 116

[PDF] acm.org

DRISA: A DRAM-based reconfigurable in-situ accelerator

S Li, D Niu, KT Malladi, H Zheng, B Brennan… - Proceedings of the 50th …, 2017 - dl.acm.org

Data movement between the processing units and the memory in traditional von Neumann
architecture is creating the "memory wall" problem. To bridge the gap, two approaches, the …

Cited by 487

[PDF] illinois.edu

Google workloads for consumer devices: Mitigating data movement bottlenecks

A Boroumand, S Ghose, Y Kim… - Proceedings of the …, 2018 - dl.acm.org

We are experiencing an explosive growth in the number of consumer devices, including
smartphones, tablets, web-based computers such as Chromebooks, and wearable devices …

Cited by 440

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer

The “memory wall” problem or so-called von Neumann bottleneck limits the efficiency of
conventional computer architectures, which move data from memory to CPU for …

Cited by 186

PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture

A Survey of Design and Optimization for Systolic Array-based DNN Accelerators