Large language model inference acceleration: A comprehensive hardware perspective

J Li, J Xu, S Huang, Y Chen, W Li, J Liu, Y Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
fields, from natural language understanding to text generation. Compared to non-generative …

Memory-Centric Computing: Recent Advances in Processing-in-DRAM

O Mutlu, A Olgun, GF Oliveira… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Memory-centric computing aims to enable computation capability in and near all places
where data is generated and stored. As such, it can greatly reduce the large negative …

Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching

S Yun, K Kyung, J Cho, J Choi, J Kim… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Large language models (LLMs) have emerged due to their capability to generate high-
quality content across diverse contexts. To reduce their explosively increasing demands for …

PyGim: An Efficient Graph Neural Network Library for Real Processing-In-Memory Architectures

C Giannoula, P Yang, I Fernandez, J Yang… - Proceedings of the …, 2024 - dl.acm.org
Graph Neural Networks (GNNs) are emerging models for analyzing graph-structured data. GNN
execution involves both compute-intensive and memory-intensive kernels. The latter kernels …

A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models

C Guo, F Cheng, Z Du, J Kiessling, J Ku… - IEEE Circuits and …, 2025 - ieeexplore.ieee.org
The rapid development of large language models (LLMs) has significantly transformed the
field of artificial intelligence, demonstrating remarkable capabilities in natural language …

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

J Cho, M Kim, H Choi, G Heo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Recently, there has been an extensive research effort in building efficient large language
model (LLM) inference serving systems. These efforts not only include innovations in the …

INF^2: High-Throughput Generative Inference of Large Language Models using Near-Storage Processing

H Jang, S Noh, C Shin, J Jung, J Song… - arXiv preprint arXiv …, 2025 - arxiv.org
The growing memory and computational demands of large language models (LLMs) for
generative inference present significant challenges for practical deployment. One promising …

PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System

Y He, H Mao, C Giannoula, M Sadrosadati… - arXiv preprint arXiv …, 2025 - arxiv.org
Large language models (LLMs) are widely used for natural language understanding and
text generation. An LLM relies on a time-consuming step called LLM decoding to …

PIM Is All You Need: A CXL-Enabled GPU-Free System for Large Language Model Inference

Y Gu, A Khadem, S Umesh, N Liang, X Servot… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Model (LLM) inference generates tokens autoregressively, one at a
time, which exhibits notably lower operational intensity compared to earlier …

GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices

M Navardi, R Aalishah, Y Fu, Y Lin, H Li… - arXiv preprint arXiv …, 2025 - arxiv.org
Generative Artificial Intelligence (GenAI) applies models and algorithms such as Large
Language Models (LLMs) and Foundation Models (FMs) to generate new data. GenAI, as a …