Google Scholar

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org

FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

Cited by 48 - All 3 versions

[PDF] acm.org

AutoSA: A polyhedral compiler for high-performance systolic arrays on FPGA

J Wang, L Guo, J Cong - The 2021 ACM/SIGDA International Symposium …, 2021 - dl.acm.org

While systolic array architectures have the potential to deliver tremendous performance, it is
notoriously challenging to customize an efficient systolic array processor for a target …

Cited by 150 - All 5 versions

[PDF] acm.org

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture

J Zhuang, J Lau, H Ye, Z Yang, Y Du, J Lo… - Proceedings of the …, 2023 - dl.acm.org

Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …

Cited by 49 - All 8 versions

[PDF] nsf.gov

PolySA: Polyhedral-based systolic array auto-compilation

J Cong, J Wang - 2018 IEEE/ACM International Conference on …, 2018 - ieeexplore.ieee.org

Automatic systolic array generation has long been an interesting topic due to the need to
reduce the lengthy development cycles of manual designs. Existing automatic systolic array …

Cited by 147 - All 7 versions

[PDF] nsf.gov

SODA: Stencil with optimized dataflow architecture

Y Chi, J Cong, P Wei, P Zhou - 2018 IEEE/ACM International …, 2018 - ieeexplore.ieee.org

Stencil computation is one of the most important kernels in many application domains such
as image processing, solving partial differential equations, and cellular automata. Many of …

Cited by 137 - All 9 versions

[PDF] acm.org

AutoBridge: Coupling coarse-grained floorplanning and pipelining for high-frequency HLS design on multi-die FPGAs

L Guo, Y Chi, J Wang, J Lau, W Qiao, E Ustun… - The 2021 ACM/SIGDA …, 2021 - dl.acm.org

Despite an increasing adoption of high-level synthesis (HLS) for its design productivity
advantages, there remains a significant gap in the achievable clock frequency between an …

Cited by 67 - All 8 versions

[PDF] acm.org

RapidStream: parallel physical implementation of FPGA HLS designs

L Guo, P Maidee, Y Zhou, C Lavin, J Wang… - Proceedings of the …, 2022 - dl.acm.org

FPGAs require a much longer compilation cycle than conventional computing platforms like
CPUs. In this paper, we shorten the overall compilation time by co-optimizing the HLS …

Cited by 46 - All 8 versions

[PDF] acm.org

Sextans: A streaming accelerator for general-purpose sparse-matrix dense-matrix multiplication

L Song, Y Chi, A Sohrabizadeh, Y Choi, J Lau… - Proceedings of the …, 2022 - dl.acm.org

Sparse-Matrix Dense-Matrix multiplication (SpMM) is the key operator for a wide range of
applications including scientific computing, graph processing, and deep learning …

Cited by 45 - All 9 versions

[HTML] nih.gov

Extending high-level synthesis for task-parallel programs

Y Chi, L Guo, J Lau, Y Choi, J Wang… - 2021 IEEE 29th Annual …, 2021 - ieeexplore.ieee.org

C/C++/OpenCL-based high-level synthesis (HLS) becomes more and more popular for
field-programmable gate array (FPGA) accelerators in many application domains in recent years …

Cited by 61 - All 13 versions

[PDF] acm.org

CHARM 2.0: Composing Heterogeneous Accelerators for Deep Learning on Versal ACAP Architecture

J Zhuang, J Lau, H Ye, Z Yang, S Ji, J Lo… - ACM Transactions on …, 2024 - dl.acm.org

Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …

Cited by 3 - All 7 versions


Latte: Locality aware transformation for high-level synthesis

Programming and synthesis for software-defined FPGA acceleration: status and future prospects
