Programming and synthesis for software-defined FPGA acceleration: status and future prospects

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org
FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

FPGA sharing in the cloud: a comprehensive analysis

J Guo, L Zhang, J Romero Hung, C Li, J Zhao… - Frontiers of Computer …, 2023 - Springer
Cloud vendors are actively adopting FPGAs into their infrastructures for enhancing
performance and efficiency. As cloud services continue to evolve, FPGA (field …

Towards a uniform template-based architecture for accelerating 2D and 3D CNNs on FPGA

J Shen, Y Huang, Z Wang, Y Qiao, M Wen… - Proceedings of the 2018 …, 2018 - dl.acm.org
Three-dimensional convolutional neural networks (3D CNNs) are used efficiently in many
computer vision applications. Most previous work in this area has concentrated only on …

TGPA: Tile-grained pipeline architecture for low latency CNN inference

X Wei, Y Liang, X Li, CH Yu, P Zhang… - 2018 IEEE/ACM …, 2018 - ieeexplore.ieee.org
FPGAs are more and more widely used as reconfigurable hardware accelerators for
applications leveraging convolutional neural networks (CNNs) in recent years. Previous …

Automated accelerator generation and optimization with composable, parallel and pipeline architecture

J Cong, P Wei, CH Yu, P Zhang - Proceedings of the 55th Annual Design …, 2018 - dl.acm.org
CPU-FPGA heterogeneous architectures feature flexible acceleration of many workloads to
advance computational capabilities and energy efficiency in today's datacenters. This …

FANS: FPGA-accelerated near-storage sorting

W Qiao, J Oh, L Guo, MCF Chang… - 2021 IEEE 29th Annual …, 2021 - ieeexplore.ieee.org
Large-scale sorting is always an important yet demanding task for data center applications.
In addition to powerful processing capability, high-performance sorting system requires …

COSMOS: Coordination of high-level synthesis and memory optimization for hardware accelerators

L Piccolboni, P Mantovani, GD Guglielmo… - ACM Transactions on …, 2017 - dl.acm.org
Hardware accelerators are key to the efficiency and performance of system-on-chip (SoC)
architectures. With high-level synthesis (HLS), designers can easily obtain several …

CHIP-KNN: A configurable and high-performance k-nearest neighbors accelerator on cloud FPGAs

A Lu, Z Fang, N Farahpour… - … Conference on Field …, 2020 - ieeexplore.ieee.org
The k-nearest neighbors (KNN) algorithm is an essential algorithm in many applications,
such as similarity search, image classification, and database query. With the rapid growth in …

HLS-based optimization and design space exploration for applications with variable loop bounds

Y Choi, J Cong - 2018 IEEE/ACM International Conference on …, 2018 - ieeexplore.ieee.org
In order to further increase the productivity of field-programmable gate array (FPGA)
programmers, several design space exploration (DSE) frameworks for high-level synthesis …

Demystifying the memory system of modern datacenter FPGAs for software programmers through microbenchmarking

A Lu, Z Fang, W Liu, L Shannon - The 2021 ACM/SIGDA International …, 2021 - dl.acm.org
With the public availability of FPGAs from major cloud service providers like AWS, Alibaba,
and Nimbix, hardware and software developers can now easily access FPGA platforms …