A mixed-precision RISC-V processor for extreme-edge DNN inference

G Ottavi, A Garofalo, G Tagliavini… - 2020 IEEE Computer …, 2020 - ieeexplore.ieee.org
Low bit-width Quantized Neural Networks (QNNs) enable deployment of complex machine
learning models on constrained devices such as microcontrollers (MCUs) by reducing their …

A 1.15 TOPS/W, 16-cores parallel ultra-low power cluster with 2b-to-32b fully flexible bit-precision and vector lockstep execution mode

A Garofalo, G Ottavi, A Di Mauro, F Conti… - … 2021-IEEE 47th …, 2021 - ieeexplore.ieee.org
IoT end-nodes require extreme performance and energy efficiency coupled with high
flexibility to deal with the increasing computational requirements and variety of modern near …

Algorithm/Accelerator co-design and co-search for edge AI

X Zhang, Y Li, J Pan, D Chen - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org
The world has seen the great success of deep neural networks (DNNs) in a massive number
of artificial intelligence (AI) applications. However, developing high-quality AI services to …

HiKonv: High throughput quantized convolution with novel bit-wise management and computation

X Liu, Y Chen, P Ganesh, J Pan… - 2022 27th Asia and …, 2022 - ieeexplore.ieee.org
Quantization for Convolutional Neural Network (CNN) has shown significant progress with
the intention of reducing the cost of computation and storage with low-bitwidth data inputs …

Compressive sensing using iterative hard thresholding with low precision data representation: Theory and applications

NM Gürel, K Kara, A Stojanov, T Smith… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Modern scientific instruments produce vast amounts of data, which can overwhelm the
processing ability of computer systems. Lossy compression of data is an intriguing solution …

Machine Learning on Manycore CPUs

E Wszola - 2022 - research-collection.ethz.ch
Recent years have seen a rapid emergence of manycore machines and multicore
processors with twenty or more cores. While writing code for the standard CPUs requires …

On linear learning with manycore processors

E Wszola, C Mendler-Dünner, M Jaggi… - 2019 IEEE 26th …, 2019 - ieeexplore.ieee.org
A new generation of manycore processors is on the rise that offers dozens and more cores
on a chip and, in a sense, fuses host processor and accelerator. In this paper we target the …

Building Abstractions for Staged DSLs in Performance-Oriented Program Generators

A Stojanov - 2019 - research-collection.ethz.ch
Developing high-performance code for numerical domains is challenging, as it requires
hand-in-hand specialization with the continuous evolution of modern hardware. Program …