FBLAS: Streaming linear algebra on FPGA

T De Matteis, J de Fine Licht… - … conference for high …, 2020 - ieeexplore.ieee.org
Spatial computing architectures pose an attractive alternative to mitigate control and data
movement overheads typical of load-store architectures. In practice, these devices are rarely …

MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine

E Taka, A Arora, KC Wu… - … Conference on Field …, 2023 - ieeexplore.ieee.org
The increasing computational and memory requirements of Deep Learning (DL) workloads
have led to outstanding innovations in hardware architectures. An archetype of such …

Approximate similarity search with FAISS framework using FPGAs on the cloud

D Danopoulos, C Kachris, D Soudris - International Conference on …, 2019 - Springer
Machine Learning algorithms, such as classification and clustering techniques,
have gained significant traction over the last years because they are vital to many real-world …

Artificial neural network and accelerator co-design using evolutionary algorithms

P Colangelo, O Segal, A Speicher… - 2019 IEEE High …, 2019 - ieeexplore.ieee.org
Multilayer feed-forward Artificial Neural Networks (ANNs) are universal function
approximators capable of modeling measurable functions to any desired degree of …

A highly parameterizable framework for conditional restricted Boltzmann machine based workloads accelerated with FPGAs and OpenCL

Z Jakšić, N Cadenelli, DB Prats, J Polo… - Future Generation …, 2020 - Elsevier
Conditional Restricted Boltzmann Machine (CRBM) is a promising candidate for a
multidimensional system modeling that can learn a probability distribution over a set of data …

An accelerated edge cloud system for energy data stream processing based on adaptive incremental deep learning scheme

SH Kim, C Lee, CH Youn - IEEE Access, 2020 - ieeexplore.ieee.org
As smart metering technology evolves, power suppliers can make low-cost, low-risk
estimation of customer-side power consumption by analyzing energy demand data collected …

Efficient 8-bit Matrix Multiplication on Intel Agilex-5 FPGAs

S Gribok, B Pasca - 2024 IEEE 32nd Annual International …, 2024 - ieeexplore.ieee.org
Matrix multiplication is a fundamental operation in many fields including artificial intelligence
and machine learning, and it often requires significant computational resources. FPGAs …

FPGA acceleration of approximate KNN indexing on high-dimensional vectors

D Danopoulos, C Kachris… - 2019 14th International …, 2019 - ieeexplore.ieee.org
Accurate and efficient Machine Learning algorithms are of vital importance to many
problems, especially on classification or clustering tasks. One of the most important algorithms …

NengoFPGA: an FPGA backend for the Nengo neural simulator

B Morcos - 2019 - uwspace.uwaterloo.ca
Low-power, high-speed neural networks are critical for providing deployable embedded AI
applications at the edge. We describe a Xilinx FPGA implementation of Neural Engineering …

Evaluations of OpenCL-written tsunami simulation on FPGA and comparison with GPU implementation

F Kono, N Nakasato, K Hayashi, A Vazhenin… - The Journal of …, 2018 - Springer
When a tsunami occurs in a sea area, prediction of its arrival time is critical for evacuating
people from the coastal area. There are many problems related to tsunami to be solved for …