A review of the optimal design of neural networks based on FPGA
C Wang, Z Luo - Applied Sciences, 2022 - mdpi.com
Deep learning based on neural networks has been widely used in image recognition,
speech recognition, natural language processing, automatic driving, and other fields and …
A survey and taxonomy of FPGA-based deep learning accelerators
Deep learning, the fastest growing segment of Artificial Neural Network (ANN), has led to the
emergence of many machine learning applications and their implementation across multiple …
Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …
FlexCNN: An end-to-end framework for composing CNN accelerators on FPGA
With reduced data reuse and parallelism, recent convolutional neural networks (CNNs)
create new challenges for FPGA acceleration. Systolic arrays (SAs) are efficient, scalable …
Accelerating attention through gradient-based learned runtime pruning
Self-attention is a key enabler of state-of-art accuracy for various transformer-based Natural
Language Processing models. This attention mechanism calculates a correlation score for …
Optimizing CNN-based segmentation with deeply customized convolutional and deconvolutional architectures on FPGA
Convolutional Neural Networks (CNNs) based algorithms have been successful in solving
image recognition problems, showing very large accuracy improvement. In recent years …
Memristive GAN in analog
Generative Adversarial Network (GAN) requires extensive computing resources
making its implementation in edge devices with conventional microprocessor hardware a …
Uni-OPU: An FPGA-Based Uniform Accelerator for Convolutional and Transposed Convolutional Networks
In this article, we design the first full software/hardware stack, called Uni-OPU, for an efficient
uniform hardware acceleration of different types of transposed convolutional (TCONV) …
Sparse attention acceleration with synergistic in-memory pruning and on-chip recomputation
As its core computation, a self-attention mechanism gauges pairwise correlations across the
entire input sequence. Despite favorable performance, calculating pairwise correlations is …
GANPU: An energy-efficient multi-DNN training processor for GANs with speculative dual-sparsity exploitation
This article presents generative adversarial network processing unit (GANPU), an energy-
efficient multiple deep neural network (DNN) training processor for GANs. It enables on …