A 16-nm SoC for noise-robust speech and NLP edge AI inference with Bayesian sound source separation and attention-based DNNs

T Tambe, EY Yang, GG Ko, Y Chai… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
The proliferation of personal artificial intelligence (AI)-assistant technologies with speech-
based conversational AI interfaces is driving the exponential growth in the consumer Internet …

AI accelerator on IBM Telum processor: Industrial product

C Lichtenau, A Buyuktosunoglu, R Bertran… - Proceedings of the 49th …, 2022 - dl.acm.org
IBM Telum is the next generation processor chip for IBM Z and LinuxONE systems. The
Telum design is focused on enterprise class workloads and it achieves over 40% per socket …

A high-density and reconfigurable SRAM-based digital compute-in-memory macro for low-power AI chips

C Zhang, M Wang, Y Mai, C Tang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
This brief presents a high-density and configurable digital SRAM-based compute-in-memory
(CIM) macro that performs multiply-and-accumulation (MAC) operations for low-power …

MiniFloat-NN and ExSdotp: An ISA extension and a modular open hardware unit for low-precision training on RISC-V cores

L Bertaccini, G Paulin, T Fischer… - 2022 IEEE 29th …, 2022 - ieeexplore.ieee.org
Low-precision formats have recently driven major breakthroughs in neural network (NN)
training and inference by reducing the memory footprint of the NN models and improving the …

An efficient training accelerator for transformers with hardware-algorithm co-optimization

H Shao, J Lu, M Wang, Z Wang - IEEE Transactions on Very …, 2023 - ieeexplore.ieee.org
Transformers have achieved significant success in deep learning, and training Transformers
efficiently on resource-constrained platforms has been attracting continuous attention for …