Programming and synthesis for software-defined FPGA acceleration: status and future prospects

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org
FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

ThunderGP: HLS-based graph processing framework on FPGAs

X Chen, H Tan, Y Chen, B He, WF Wong… - The 2021 ACM/SIGDA …, 2021 - dl.acm.org
FPGA has been an emerging computing infrastructure in datacenters benefiting from
features of fine-grained parallelism, energy efficiency, and reconfigurability. Meanwhile …

Towards a uniform template-based architecture for accelerating 2D and 3D CNNs on FPGA

J Shen, Y Huang, Z Wang, Y Qiao, M Wen… - Proceedings of the 2018 …, 2018 - dl.acm.org
Three-dimensional convolutional neural networks (3D CNNs) are used efficiently in many
computer vision applications. Most previous work in this area has concentrated only on …

TGPA: Tile-grained pipeline architecture for low latency CNN inference

X Wei, Y Liang, X Li, CH Yu, P Zhang… - 2018 IEEE/ACM …, 2018 - ieeexplore.ieee.org
FPGAs are more and more widely used as reconfigurable hardware accelerators for
applications leveraging convolutional neural networks (CNNs) in recent years. Previous …

FPGA sharing in the cloud: a comprehensive analysis

J Guo, L Zhang, J Romero Hung, C Li, J Zhao… - Frontiers of Computer …, 2023 - Springer
Cloud vendors are actively adopting FPGAs into their infrastructures for enhancing
performance and efficiency. As cloud services continue to evolve, FPGA (field …

Automated accelerator generation and optimization with composable, parallel and pipeline architecture

J Cong, P Wei, CH Yu, P Zhang - Proceedings of the 55th Annual Design …, 2018 - dl.acm.org
CPU-FPGA heterogeneous architectures feature flexible acceleration of many workloads to
advance computational capabilities and energy efficiency in today's datacenters. This …

FANS: FPGA-accelerated near-storage sorting

W Qiao, J Oh, L Guo, MCF Chang… - 2021 IEEE 29th Annual …, 2021 - ieeexplore.ieee.org
Large-scale sorting is always an important yet demanding task for data center applications.
In addition to powerful processing capability, high-performance sorting system requires …

COSMOS: Coordination of high-level synthesis and memory optimization for hardware accelerators

L Piccolboni, P Mantovani, GD Guglielmo… - ACM Transactions on …, 2017 - dl.acm.org
Hardware accelerators are key to the efficiency and performance of system-on-chip (SoC)
architectures. With high-level synthesis (HLS), designers can easily obtain several …

Demystifying the memory system of modern datacenter FPGAs for software programmers through microbenchmarking

A Lu, Z Fang, W Liu, L Shannon - The 2021 ACM/SIGDA International …, 2021 - dl.acm.org
With the public availability of FPGAs from major cloud service providers like AWS, Alibaba,
and Nimbix, hardware and software developers can now easily access FPGA platforms …

Automatic creation of high-bandwidth memory architectures from domain-specific languages: The case of computational fluid dynamics

S Soldavini, K Friebel, M Tibaldi, G Hempel… - ACM Transactions on …, 2023 - dl.acm.org
Numerical simulations can help solve complex problems. Most of these algorithms are
massively parallel and thus good candidates for FPGA acceleration thanks to spatial …