A 28-nm 135.19 TOPS/W Bootstrapped-SRAM Compute-in-Memory Accelerator With Layer-Wise Precision and Sparsity
W Mao, D Liu, H Zhou, F Li, K Li, Q Wu… - … on Circuits and …, 2024 - ieeexplore.ieee.org
Artificial intelligence (AI) edge devices demand high energy efficiency as well as inference
accuracy. SRAM-based compute-in-memory (CIM) accelerators have great potential for …
LAMPS: A Layer-wised Mixed-Precision-and-Sparsity Accelerator for NAS-Optimized CNNs on FPGA
S Yang, C Ding, M Huang, K Li, C Li… - 2024 IEEE 32nd …, 2024 - ieeexplore.ieee.org
The increasing model size and computation load of convolutional neural networks (CNN)
pose a grand challenge to deploy CNN models on edge computing devices. To further …
A Tensor-Train Decomposition based Compression of LLMs on Group Vector Systolic Accelerator
S Huang, T Wang, A Li, A Shen, K Li, K Jiang… - arXiv preprint arXiv …, 2025 - arxiv.org
Large language models (LLMs) are both storage-intensive and computation-intensive,
posing significant challenges when deployed on resource-constrained hardware. As linear …
FMTT: Fused Multi-Head Transformer with Tensor-Compression for 3D Point Clouds Detection on Edge Devices
Z Wei, T Wang, C Ding, B Wang, Z Guan… - … , Automation & Test …, 2024 - ieeexplore.ieee.org
The real-time detection of 3D objects represents a grand challenge on edge devices.
Existing 3D point clouds models are over-parameterized with heavy computation load. This …