- Academic Search

SHARP: A short-word hierarchical accelerator for robust and practical fully homomorphic encryption

J Kim, S Kim, J Choi, J Park, D Kim… - Proceedings of the 50th …, 2023 - dl.acm.org

Fully homomorphic encryption (FHE) is an emerging cryptographic technology that
guarantees the privacy of sensitive user data by enabling direct computations on encrypted …

Cited by 48

Bingo spatial data prefetcher

M Bakhshalipour, M Shakerinava… - … Symposium on High …, 2019 - ieeexplore.ieee.org

Applications extensively use data objects with a regular and fixed layout, which leads to the
recurrence of access patterns over memory regions. Spatial data prefetching techniques …

Cited by 152

Evaluation of hardware data prefetchers on server processors

M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org

Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …

Cited by 36

GPU-NEST: Characterizing energy efficiency of multi-GPU inference servers

A Jahanshahi, HZ Sabzi, C Lau… - IEEE Computer …, 2020 - ieeexplore.ieee.org

Cloud inference systems have recently emerged as a solution to the ever-increasing
integration of AI-powered applications into the smart devices around us. The wide adoption …

Cited by 51

Enhancing server efficiency in the face of killer microseconds

A Mirhosseini, A Sriraman… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org

We are entering an era of “killer microseconds” in data center applications. Killer
microseconds refer to μs-scale “holes” in CPU schedules caused by stalls to access fast I/O …

Cited by 54

BlockMaestro: Enabling programmer-transparent task-based execution in GPU systems

AA Abdolrashidi, HA Esfeden… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

As modern GPU workloads grow in size and complexity, there is an ever-increasing demand
for GPU computational power. Emerging workloads contain hundreds or thousands of GPU …

Cited by 19

BOW: Breathing operand windows to exploit bypassing in GPUs

HA Esfeden, A Abdolrashidi, S Rahman… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org

The Register File (RF) is a critical structure in Graphics Processing Units (GPUs) responsible
for a large portion of the area and power. To simplify the architecture of the RF, it is …

Cited by 21

OSM: Off-chip shared memory for GPUs

S Darabi, E Yousefzadeh-Asl-Miandoab… - … on Parallel and …, 2022 - ieeexplore.ieee.org

Graphics Processing Units (GPUs) employ a shared memory, a software-managed cache for
programmers, in each streaming multiprocessor to accelerate data sharing among the …

Cited by 11

READY: A fine-grained multithreading overlay framework for modern CPU-FPGA dataflow applications

LBD Silva, R Ferreira, M Canesche… - ACM Transactions on …, 2019 - dl.acm.org

In this work, we propose a framework called REconfigurable Accelerator DeploY (READY),
the first framework to support polynomial runtime mapping of dataflow applications in high …

Cited by 26

High performance and power efficient accelerator for cloud inference

J Yao, H Zhou, Y Zhang, Y Li, C Feng… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Facing the growing complexity of Deep Neural Networks (DNNs), high-performance and
power-efficient AI accelerators are desired to provide effective and affordable cloud …

Cited by 4
