Google Scholar

A survey of coarse-grained reconfigurable architecture and design: Taxonomy, challenges, and applications

L Liu, J Zhu, Z Li, Y Lu, Y Deng, J Han, S Yin… - ACM Computing …, 2019 - dl.acm.org

As general-purpose processors have hit the power wall and chip fabrication cost escalates
alarmingly, coarse-grained reconfigurable architectures (CGRAs) are attracting increasing …

Cited by 234 - All 2 versions

[PDF] google.com

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org

Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

Cited by 470 - All 3 versions

[PDF] ieee.org

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org

Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …

Cited by 108 - All 10 versions

[PDF] usc.edu

GraphP: Reducing communication for PIM-based graph processing with efficient data partition

M Zhang, Y Zhuo, C Wang, M Gao, Y Wu… - … Symposium on High …, 2018 - ieeexplore.ieee.org

Processing-In-Memory (PIM) is an effective technique that reduces data movements by
integrating processing units within memory. The recent advance of “big data” and 3D …

Cited by 272 - All 7 versions

[PDF] acm.org

Tangram: Optimized coarse-grained dataflow for scalable NN accelerators

M Gao, X Yang, J Pu, M Horowitz… - Proceedings of the Twenty …, 2019 - dl.acm.org

The use of increasingly larger and more complex neural networks (NNs) makes it critical to
scale the capabilities and efficiency of NN accelerators. Tiled architectures provide an …

Cited by 196 - All 5 versions

[PDF] acm.org

GraphQ: Scalable PIM-based graph processing

Y Zhuo, C Wang, M Zhang, R Wang, D Niu… - Proceedings of the …, 2019 - dl.acm.org

Processing-In-Memory (PIM) architectures based on recent technology advances (e.g.,
Hybrid Memory Cube) demonstrate great potential for graph processing. However, existing …

Cited by 175 - All 4 versions

[PDF] arxiv.org

NERO: A near high-bandwidth memory stencil accelerator for weather prediction modeling

G Singh, D Diamantopoulos… - … Conference on Field …, 2020 - ieeexplore.ieee.org

Ongoing climate change calls for fast and accurate weather and climate modeling. However,
when solving large-scale weather prediction simulations, state-of-the-art CPU and GPU …

Cited by 98 - All 9 versions

[PDF] mit.edu

KPart: A hybrid cache partitioning-sharing technique for commodity multicores

N El-Sayed, A Mukkara, PA Tsai… - … Symposium on High …, 2018 - ieeexplore.ieee.org

Cache partitioning is now available in commercial hardware. In theory, software can
leverage cache partitioning to use the last-level cache better and improve performance. In …

Cited by 166 - All 7 versions

[PDF] acm.org

Mira: A program-behavior-guided far memory system

Z Guo, Z He, Y Zhang - Proceedings of the 29th Symposium on …, 2023 - dl.acm.org

Far memory, where memory accesses are non-local, has become more popular in recent
years as a solution to expand memory size and avoid memory stranding. Prior far memory …

Cited by 17 - All 3 versions

[PDF] hal.science

Veloc: Towards high performance adaptive asynchronous checkpointing at large scale

B Nicolae, A Moody, E Gonsiorowski… - 2019 IEEE …, 2019 - ieeexplore.ieee.org

Global checkpointing to external storage (e.g., a parallel file system) is a common I/O pattern
of many HPC applications. However, given the limited I/O throughput of external storage …

Cited by 106 - All 7 versions
