Google Scholar

Snake: A variable-length chain-based prefetching for GPUs

S Mostofi, H Falahati, N Mahani… - Proceedings of the 56th …, 2023 - dl.acm.org

Graphics Processing Units (GPUs) utilize memory hierarchy and Thread-Level Parallelism
(TLP) to tolerate off-chip memory latency, which is a significant bottleneck for memory-bound …

Cited by 5

Agents of Autonomy: A Systematic Study of Robotics on Modern Hardware

M Bakhshalipour, PB Gibbons - … of the ACM on Measurement and …, 2023 - dl.acm.org

As robots increasingly permeate modern society, it is crucial for the system and hardware
research community to bridge its long-standing gap with robotics. This divide has persisted …

Cited by 4

Morpheus: Extending the last level cache capacity in GPU systems using idle GPU core resources

S Darabi, M Sadrosadati, N Akbarzadeh… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel
applications. In many GPU applications, GPU memory bandwidth bottlenecks performance …

Cited by 17

Cross-core Data Sharing for Energy-efficient GPUs

H Falahati, M Sadrosadati, Q Xu… - ACM Transactions on …, 2024 - dl.acm.org

Graphics Processing Units (GPUs) are the accelerator of choice in a variety of application
domains, because they can accelerate massively parallel workloads and can be easily …

Cited by 1

Tartan: Microarchitecting a Robotic Processor

M Bakhshalipour, PB Gibbons - 2024 ACM/IEEE 51st Annual …, 2024 - ieeexplore.ieee.org

This paper presents Tartan, a CPU architecture designed for a wide range of robotic
applications. Tartan provides architectural support for common robotic kernels, ensuring its …

Slightly Off-Axis Digital Holography Using a Transmission Grating and GPU-Accelerated Parallel Phase Reconstruction

H Bai, J Chen, L Sun, L Li, J Zhang - Photonics, 2023 - mdpi.com

Slightly off-axis digital holography is proposed using transmission grating to obtain
quantitative phase distribution. The experimental device is based on an improved 4f optical …

Cited by 2

SMILE: LLC-based Shared Memory Expansion to Improve GPU Thread Level Parallelism

T Guo, X Huang, K Wu, X Zhang, N Xiao - … of the 61st ACM/IEEE Design …, 2024 - dl.acm.org

While designed for massive parallelism, GPUs are frequently suffering from low thread
occupancy and limited data throughput, which are typically attributed to constrained on-chip …

Systematic Review of Accelerating Time-Series Biosignal Machine Learning Processes Using GPU Architectures

E Ketola, M Imtiaz - 2024 - preprints.org

Background: Time-series biosignal data, representative of a physiological process, is often
applied to time-sensitive machine learning applications that benefit from acceleration …

A Bandwidth-Adaptive On-Chip Storage Network Architecture

K Wang, N Yu, D Tian, L Yang… - 2024 9th International …, 2024 - ieeexplore.ieee.org

A scalable bandwidth-adaptive on-chip storage network architecture is proposed to address
the severe data conflict and low bus parallelism in existing multi-level storage, Crossbar …

Pomelo: Alternative mechanism of threads communication for accelerating convolution on SIMT based processor

Z Feng, L Yang, Y Zhang - 2024 9th International Conference …, 2024 - ieeexplore.ieee.org

Single Instruction Multiple Thread (SIMT) based processor and parallel model are effective
ways to solve computation problems exist in big data era. Commonly, work load is organized …
