AutoSA: A polyhedral compiler for high-performance systolic arrays on FPGA
While systolic array architectures have the potential to deliver tremendous performance, it is
notoriously challenging to customize an efficient systolic array processor for a target …
AutoBridge: Coupling coarse-grained floorplanning and pipelining for high-frequency HLS design on multi-die FPGAs
Despite an increasing adoption of high-level synthesis (HLS) for its design productivity
advantages, there remains a significant gap in the achievable clock frequency between an …
HBM Connect: High-performance HLS interconnect for FPGA HBM
With the recent release of High Bandwidth Memory (HBM) based FPGA boards, developers
can now exploit unprecedented external memory bandwidth. This allows more memory …
GME: GPU-based microarchitectural extensions to accelerate homomorphic encryption
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without
decrypting it. FHE has garnered significant attention over the past decade as it supports …
Sextans: A streaming accelerator for general-purpose sparse-matrix dense-matrix multiplication
Sparse-Matrix Dense-Matrix multiplication (SpMM) is the key operator for a wide range of
applications including scientific computing, graph processing, and deep learning …
FleetRec: Large-scale recommendation inference on hybrid GPU-FPGA clusters
We present FleetRec, a high-performance and scalable recommendation inference system
within tight latency constraints. FleetRec takes advantage of heterogeneous hardware …
Accelerating SSSP for power-law graphs
The single-source shortest path (SSSP) problem is one of the most important and
well-studied graph problems widely used in many application domains, such as road navigation …
Shuhai: A tool for benchmarking high bandwidth memory on FPGAs
FPGAs are starting to incorporate High Bandwidth Memory (HBM) to both reduce the
memory bandwidth bottleneck encountered in some applications and to provide more …
Automatic creation of high-bandwidth memory architectures from domain-specific languages: The case of computational fluid dynamics
Numerical simulations can help solve complex problems. Most of these algorithms are
massively parallel and thus good candidates for FPGA acceleration thanks to spatial …
A survey of FPGA optimization methods for data center energy efficiency
M Tibaldi, C Pilato - IEEE Transactions on Sustainable …, 2023 - ieeexplore.ieee.org
This article provides a survey of academic literature about field programmable gate array
(FPGA) and their utilization for energy efficiency acceleration in data centers. The goal is to …