SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators

M Odema, L Chen, H Kwon… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Emerging multi-model workloads with heavy models like recent large language models
significantly increased the compute and memory demands on hardware. To address such …

HyDe: A Hybrid PCM/FeFET/SRAM Device-search for Optimizing Area and Energy-efficiencies in Analog IMC Platforms

A Bhattacharjee, A Moitra… - IEEE Journal on Emerging …, 2023 - ieeexplore.ieee.org
Today, there are a plethora of In-Memory Computing (IMC) devices (SRAMs, PCMs &
FeFETs) that emulate convolutions on crossbar-arrays with high throughput. Each IMC …

HISIM: Analytical Performance Modeling and Design Space Exploration of 2.5D/3D Integration for AI Computing

Z Wang, PS Nalla, J Sun, AA Goksoy… - … on Computer-Aided …, 2025 - ieeexplore.ieee.org
Monolithic designs face significant fabrication cost and data movement challenges,
especially when executing complex and diverse AI models. Advanced 2.5D/3D packaging …

Exploiting 2.5D/3D Heterogeneous Integration for AI Computing

Z Wang, J Sun, A Goksoy, SK Mandal… - 2024 29th Asia and …, 2024 - ieeexplore.ieee.org
The evolution of AI algorithms has not only revolutionized many application domains, but
also posed tremendous challenges on the hardware platform. Advanced packaging …

Performance Implications of Multi-Chiplet Neural Processing Units on Autonomous Driving Perception

M Odema, L Chen, H Kwon, MAA Faruque - arXiv preprint arXiv …, 2024 - arxiv.org
We study the application of emerging chiplet-based Neural Processing Units to accelerate
vehicular AI perception workloads in constrained automotive settings. The motivation stems …

3D In-Sensor Computing for Real-Time DVS Data Compression: 65nm Hardware-Algorithm Co-Design

GR Nair, PS Nalla, G Krishnan, J Oh… - IEEE Solid-State …, 2024 - ieeexplore.ieee.org
Traditional IO links are insufficient to transport high volume of image sensor data, under
stringent power and latency constraints. To address this, we demonstrate a low latency, low …

Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets

M Odema, H Kwon, MAA Faruque - arXiv preprint arXiv:2312.09401, 2023 - arxiv.org
To address increasing compute demand from recent multi-model workloads with heavy
models like large language models, we propose to deploy heterogeneous chiplet-based …

A 16nm Heterogeneous Accelerator for Energy-Efficient Sparse and Dense AI Computing

G Raveendran Nair, F Jiang, J Zhang… - Proceedings of the 29th …, 2024 - dl.acm.org
Artificial intelligence (AI) has evolved from dense Deep Neural Networks (DNNs) toward a
diverse set of models, such as sparse graph convolutional neural networks (GCNs). These …

End-to-End Benchmarking of Chiplet-Based In-Memory Computing

G Krishnan, SK Mandal, AA Goksoy… - Neuromorphic …, 2023 - intechopen.com
In-memory computing (IMC)-based hardware reduces latency and energy
consumption for compute-intensive machine learning (ML) applications. Several …

Benchmarking Heterogeneous Integration with 2.5D/3D Interconnect Modeling

Z Wang, J Sun, A Goksoy, SK Mandal… - 2023 IEEE 15th …, 2023 - ieeexplore.ieee.org
Current monolithic designs face significant challenges in terms of silicon area, fabrication
cost, and data movement especially when dealing with increasingly complex and diverse AI …