FPGA-based acceleration for convolutional neural networks on PYNQ-Z2

TV Huynh - Int. J. Comput. Digit. Syst, 2022 - pdfs.semanticscholar.org
Convolutional neural network is now widely used in computer vision and deep learning
applications. The most compute-intensive layer in convolutional neural networks is the …

Two distributed arithmetic based high throughput architectures of non-pipelined LMS adaptive filters

MT Khan, MA Alhartomi, S Alzahrani, RA Shaik… - IEEE …, 2022 - ieeexplore.ieee.org
Distributed arithmetic (DA) is an efficient look-up table (LUT) based approach. The
throughput of DA based implementation is limited by the LUT size. This paper presents two …

The Gauss-Seidel fast affine projection algorithm

F Albu, J Kadlec, N Coleman… - IEEE Workshop on Signal …, 2002 - ieeexplore.ieee.org
In this paper we propose a new stable fast affine projection algorithm based on
Gauss-Seidel iterations (GSFAP). We investigate its implementation using the logarithmic number …

A pipelined reduced complexity two-stages parallel LMS structure for adaptive beamforming

G Akkad, A Mansour, BA ElHassan… - … on Circuits and …, 2020 - ieeexplore.ieee.org
In this paper, we propose a reduced complexity parallel least mean square structure
(RC-pLMS) for adaptive beamforming and its pipelined hardware implementation. RC-pLMS is …

DGCNN on FPGA: acceleration of the point cloud classifier using FPGAs

S Jamali Golzar, G Karimian, M Shoaran… - Circuits, Systems, and …, 2023 - Springer
Over the last few years, deep learning on irregular 3D data given its wide range of
applications has become one of the active topics in the field. While field programmable gate …

Clock gating-based effectual realization of stochastic hyperbolic tangent function for deep neural hardware accelerators

G Rajput, V Logashree, KN Biyani… - Circuits, Systems, and …, 2023 - Springer
Comprehensive neural network applications led to the customization of a scheme to
accelerate the computation on ASIC implementation. Hence, the determination of activation …

An efficient implementation for linear convolution with reduced latency in FPGA

D Xue, LS DeBrunner, V DeBrunner, Z Huang… - Electronics …, 2024 - Wiley Online Library
A recently developed linear convolution filter based on Hirschman theory has shown its
advantage in saving computations compared with other convolution filters. Here, the …

Implementation of the least-squares lattice with order and forgetting factor estimation for FPGA

Z Pohl, M Tichy, J Kadlec - EURASIP Journal on Advances in Signal …, 2008 - Springer
A high performance RLS lattice filter with the estimation of an unknown order and forgetting
factor of identified system was developed and implemented as a PCORE coprocessor for …

New proportionate affine projection algorithm

F Albu - Noise Control and Acoustics Division …, 2012 - asmedigitalcollection.asme.org
A new proportionate-type affine projection algorithm with intermittent update of the weight
coefficients is proposed. It takes into account the “history” of the proportionate factors and …

An Optimization Methodology for Designing Hardware-Based Function Evaluation Modules with Reduced Complexity

G González-Díaz-Conti, O Longoria-Gandara… - Circuits, Systems, and …, 2022 - Springer
The evaluation of mathematical functions is a critical task in several hardware designs, and
piecewise polynomial approximation (PPA) is one of the main techniques widely used for …