Reflections on 10 years of FloPoCo

F de Dinechin - 2019 IEEE 26th Symposium on Computer …, 2019 - ieeexplore.ieee.org
The FloPoCo open-source arithmetic core generator project started modestly in 2008 with a
few parametric floating-point cores. It has since evolved into a framework for …

On the design of logarithmic multiplier using radix-4 Booth encoding

R Pilipović, P Bulić - IEEE access, 2020 - ieeexplore.ieee.org
This paper proposes an energy-efficient approximate multiplier which combines radix-4
Booth encoding and logarithmic product approximation. Additionally, a datapath pruning …
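
As orientation for this entry: the logarithmic side of such multipliers is usually built on Mitchell-style product approximation, where each operand is split into a leading-one position (the characteristic) and a fractional remainder (the mantissa), the two logarithms are added, and the sum is mapped back. The Python sketch below shows only this approximation for positive integers; the radix-4 Booth encoding and datapath pruning of the paper are hardware-level details it does not model, and the function name is illustrative.

```python
def mitchell_mul(a: int, b: int) -> int:
    """Approximate a * b with Mitchell's logarithmic method (functional sketch)."""
    assert a > 0 and b > 0
    ka, kb = a.bit_length() - 1, b.bit_length() - 1   # characteristics: floor(log2)
    fa = (a - (1 << ka)) / (1 << ka)                  # fractional mantissa of a, in [0, 1)
    fb = (b - (1 << kb)) / (1 << kb)                  # fractional mantissa of b, in [0, 1)
    s = fa + fb                                       # log-domain "multiplication"
    if s < 1:                                         # approximate antilogarithm
        return round((1 << (ka + kb)) * (1 + s))
    return round((1 << (ka + kb + 1)) * s)
```

For example, mitchell_mul(5, 6) returns 28 instead of 30, within Mitchell's classical worst-case relative error of roughly 11%.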

Next generation arithmetic for edge computing

A Guntoro, C De La Parra, F Merchant… - … , Automation & Test …, 2020 - ieeexplore.ieee.org
Arithmetic is a key component and is ubiquitous in today's digital world, ranging from
embedded to high-performance computing systems. With machine learning at the fore in a …

Towards globally optimal design of multipliers for FPGAs

A Böttcher, M Kumm - IEEE Transactions on Computers, 2023 - ieeexplore.ieee.org
The design of a multiplier typically consists of three steps: (1) partial product generation, (2)
compressor tree design, and (3) the selection of the final adder. Conventionally, these three …
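
The three steps named in this abstract can be mimicked functionally in software. The sketch below, assuming plain unsigned operands, generates one shifted partial-product row per bit, reduces the rows with repeated 3:2 (carry-save) compressions, and leaves the final carry-propagate addition to a single sum; the paper's point is to optimize these steps jointly rather than in isolation, which the sketch does not attempt.

```python
def multiply_three_step(a: int, b: int, n: int = 8) -> int:
    """Unsigned n-bit multiplication split into the three classical steps (sketch)."""
    # (1) partial product generation: one shifted copy of a per set bit of b
    rows = [(a if (b >> i) & 1 else 0) << i for i in range(n)]
    # (2) compressor tree: 3:2 carry-save reduction until only two rows remain
    while len(rows) > 2:
        x, y, z = rows.pop(), rows.pop(), rows.pop()
        rows.append(x ^ y ^ z)                            # column-wise sum bits
        rows.append(((x & y) | (x & z) | (y & z)) << 1)   # carries into the next column
    # (3) final carry-propagate adder on the two remaining rows
    return sum(rows)
```

Each 3:2 step works column-wise with no carry propagation, which is what makes the compressor tree the fast (and design-space-rich) part of the circuit.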

High-Efficiency Compressor Trees for Latest AMD FPGAs

K Hoßfeld, HJ Damsgaard, J Nurmi, M Blott… - ACM Transactions on …, 2024 - dl.acm.org
High-fan-in dot product computations are ubiquitous in highly relevant application domains,
such as signal processing and machine learning. In particular, the diverse set of data formats …

Optimizing bit-serial matrix multiplication for reconfigurable computing

Y Umuroglu, D Conficconi, L Rasnayake… - ACM Transactions on …, 2019 - dl.acm.org
Matrix-matrix multiplication is a key computational kernel for numerous applications in
science and engineering, with ample parallelism and data locality that lends itself well to …
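
As a rough illustration of the bit-serial idea behind this work (in the spirit of BISMO-style designs), an integer matrix product can be decomposed into binary bit-plane products summed with power-of-two weights. The NumPy sketch below assumes unsigned operands of wa and wb bits and is a functional model only, not the paper's hardware datapath.

```python
import numpy as np

def bitserial_matmul(A: np.ndarray, B: np.ndarray, wa: int, wb: int) -> np.ndarray:
    """Integer matmul as a weighted sum of binary (0/1) matrix products (sketch)."""
    acc = np.zeros((A.shape[0], B.shape[1]), dtype=np.int64)
    for i in range(wa):
        Ai = (A >> i) & 1                      # i-th bit plane of A
        for j in range(wb):
            Bj = (B >> j) & 1                  # j-th bit plane of B
            acc += (Ai @ Bj) << (i + j)        # binary matmul, weighted by 2**(i+j)
    return acc
```

With 2-bit operands this performs four binary matrix multiplications whose weighted sum equals the exact integer product, which is the trade-off bit-serial hardware exploits: simpler compute elements, more passes.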

Reconfigurable convolutional kernels for neural networks on FPGAs

M Hardieck, M Kumm, K Möller, P Zipf - Proceedings of the 2019 ACM …, 2019 - dl.acm.org
Convolutional neural networks (CNNs) have gained great success in machine learning
applications, and much attention has been paid to their acceleration on field-programmable gate …

Karatsuba with rectangular multipliers for FPGAs

M Kumm, O Gustafsson, F De Dinechin… - 2018 IEEE 25th …, 2018 - ieeexplore.ieee.org
This work presents an extension of Karatsuba's method to efficiently use rectangular
multipliers as a base for larger multipliers. The rectangular multipliers that motivate this work …
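
For background, one level of the classical (square) Karatsuba step replaces four half-size multiplications by three, at the cost of a few additions. The sketch below shows that square case with k-bit halves; the rectangular-multiplier variant developed in the paper splits operands asymmetrically to match DSP-block shapes, which is not modeled here.

```python
def karatsuba_step(a: int, b: int, k: int) -> int:
    """One Karatsuba level: three half-size multiplies instead of four (sketch)."""
    mask = (1 << k) - 1
    a_hi, a_lo = a >> k, a & mask
    b_hi, b_lo = b >> k, b & mask
    hh = a_hi * b_hi                                   # high x high
    ll = a_lo * b_lo                                   # low x low
    mid = (a_hi + a_lo) * (b_hi + b_lo) - hh - ll      # both cross terms from one multiply
    return (hh << (2 * k)) + (mid << k) + ll
```

For instance, karatsuba_step(300, 400, 8) returns 120000, the exact product.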

On the RTL implementation of FINN matrix vector unit

SA Alam, D Gregg, G Gambardella, T Preusser… - ACM Transactions on …, 2023 - dl.acm.org
Field-programmable gate array (FPGA)–based accelerators are becoming increasingly
popular for deep neural network (DNN) inference due to their ability to scale performance …

Low-precision logarithmic arithmetic for neural network accelerators

M Christ, F De Dinechin, F Pétrot - 2022 IEEE 33rd …, 2022 - ieeexplore.ieee.org
Resource requirements for hardware acceleration of neural network inference are
notoriously high, both in terms of computation and storage. One way to mitigate this issue is …
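
The core of low-precision logarithmic arithmetic is that multiplications become fixed-point additions of log-domain codes. A minimal Python sketch of that idea, assuming positive values and a hypothetical 4-bit fractional log format, and ignoring the genuinely hard part (log-domain addition and accumulation) that such accelerators must also handle:

```python
import math

FRAC_BITS = 4  # illustrative fractional precision of the log-domain code

def to_lns(x: float) -> int:
    """Quantize a positive value to a fixed-point base-2 logarithm (sign handling omitted)."""
    return round(math.log2(x) * (1 << FRAC_BITS))

def lns_mul(cx: int, cy: int) -> int:
    """In a logarithmic number system, multiplication is just integer addition."""
    return cx + cy

def from_lns(c: int) -> float:
    return 2.0 ** (c / (1 << FRAC_BITS))

# 3.0 * 5.0 is recovered as roughly 14.7 with this coarse 4-bit log format
print(from_lns(lns_mul(to_lns(3.0), to_lns(5.0))))
```

The approximation error is governed by the fractional log precision, which is the knob such designs trade against accelerator accuracy.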