Testability and dependability of AI hardware: Survey, trends, challenges, and perspectives
Hardware realization of artificial intelligence (AI) requires new design styles and even
underlying technologies than those used in traditional digital processors or logic circuits …
Dependable DNN accelerator for safety-critical systems: A review on the aging perspective
In the modern era, artificial intelligence (AI) and deep learning (DL) seamlessly integrate into
various spheres of our daily lives. These cutting-edge disciplines have given rise to …
A low-cost fault corrector for deep neural networks through range restriction
The adoption of deep neural networks (DNNs) in safety-critical domains has engendered
serious reliability concerns. A prominent example is hardware transient faults that are …
Understanding and mitigating hardware failures in deep learning training systems
Y He, M Hutton, S Chan, R De Gruijl… - Proceedings of the 50th …, 2023 - dl.acm.org
Deep neural network (DNN) training workloads are increasingly susceptible to hardware
failures in datacenters. For example, Google experienced "mysterious, difficult to identify …
ByteTransformer: A high-performance transformer boosted for variable-length inputs
Transformers have become keystone models in natural language processing over the past
decade. They have achieved great popularity in deep learning applications, but the …
Towards energy-efficient and secure edge AI: A cross-layer framework ICCAD special session paper
The security and privacy concerns along with the amount of data that is required to be
processed on regular basis has pushed processing to the edge of the computing systems …
DistServe: Disaggregating prefill and decoding for goodput-optimized large language model serving
DistServe improves the performance of large language models (LLMs) serving by
disaggregating the prefill and decoding computation. Existing LLM serving systems colocate …
Exploring Winograd convolution for cost-effective neural network fault tolerance
Winograd is generally utilized to optimize convolution performance and computational
efficiency because of the reduced multiplication operations, but the reliability issues brought …
Soft error tolerant convolutional neural networks on FPGAs with ensemble learning
Z Gao, H Zhang, Y Yao, J Xiao, S Zeng… - … Transactions on Very …, 2022 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are widely used in computer vision and natural
language processing. Field-programmable gate arrays (FPGAs) are popular accelerators for …
Improving fault tolerance for reliable DNN using boundary-aware activation
In this article, we approach to construct reliable deep neural networks (DNNs) for safety-
critical artificial intelligent applications. We propose to modify rectified linear unit (ReLU), a …