Google Scholar

Dfx: A low-latency multi-fpga appliance for accelerating transformer-based text generation

S Hong, S Moon, J Kim, S Lee, M Kim… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

Transformer is a deep learning language model widely used for natural language
processing (NLP) services in datacenters. Among transformer models, Generative …

Cited by 69 · All 10 versions

[PDF] usenix.org

CXL-ANNS: Software-Hardware collaborative memory disaggregation and computation for Billion-Scale approximate nearest neighbor search

J Jang, H Choi, H Bae, S Lee, M Kwon… - 2023 USENIX Annual …, 2023 - usenix.org

We propose CXL-ANNS, a software-hardware collaborative approach to enable highly
scalable approximate nearest neighbor search (ANNS) services. To this end, we first …

Cited by 33 · All 2 versions

[PDF] firoozshahian.com

Mtia: First generation silicon targeting meta's recommendation systems

A Firoozshahian, J Coburn, R Levenstein… - Proceedings of the 50th …, 2023 - dl.acm.org

Meta has traditionally relied on using CPU-based servers for running inference workloads,
specifically Deep Learning Recommendation Models (DLRM), but the increasing compute …

Cited by 33 · All 4 versions

[PDF] arxiv.org

Magma: An optimization framework for mapping multiple dnns on multiple accelerator cores

SC Kao, T Krishna - 2022 IEEE International Symposium on …, 2022 - ieeexplore.ieee.org

As Deep Learning continues to drive a variety of applications in edge and cloud data
centers, there is a growing trend towards building large accelerators with several sub …

Cited by 50 · All 5 versions

[PDF] arxiv.org

Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology

B Hyun, T Kim, D Lee, M Rhu - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Processing-in-memory (PIM) has been explored for decades by computer architects, yet it
has never seen the light of day in real-world products due to its high design overheads and …

Cited by 20 · All 5 versions

[PDF] arxiv.org

Hercules: Heterogeneity-aware inference serving for at-scale personalized recommendation

L Ke, U Gupta, M Hempstead, CJ Wu… - … Symposium on High …, 2022 - ieeexplore.ieee.org

Personalized recommendation is an important class of deep-learning applications that
powers a large collection of internet services and consumes a considerable amount of …

Cited by 29 · All 9 versions

[PDF] acm.org

Scalability Limitations of Processing-in-Memory using Real System Evaluations

G Jonatan, H Cho, H Son, X Wu, N Livesay… - Proceedings of the …, 2024 - dl.acm.org

Processing-in-memory (PIM), where the compute is moved closer to the memory or the data,
has been widely explored to accelerate emerging workloads. Recently, different PIM-based …

Cited by 6 · All 6 versions

Accelerating ML recommendation with over a thousand RISC-V/tensor processors on Esperanto's ET-SoC-1 chip

D Ditzel, R Espasa, N Aymerich, A Baum… - 2021 IEEE Hot Chips …, 2021 - ieeexplore.ieee.org

The ET-SoC-1 has over a thousand RISC-V processors on a single TSMC 7nm chip,
including: • 1088 energy-efficient ET-Minion 64-bit RISC-V in-order cores each with a …

Cited by 29 · All 2 versions

[PDF] arxiv.org

Special session: Towards an agile design methodology for efficient, reliable, and secure ML systems

S Dave, A Marchisio, MA Hanif… - 2022 IEEE 40th VLSI …, 2022 - ieeexplore.ieee.org

The real-world use cases of Machine Learning (ML) have exploded over the past few years.
However, the current computing infrastructure is insufficient to support all real-world …

Cited by 16 · All 18 versions

[PDF] arxiv.org

The Landscape of Compute-near-memory and Compute-in-memory: A Research and Commercial Overview

AA Khan, JPC De Lima, H Farzaneh… - arXiv preprint arXiv …, 2024 - arxiv.org

In today's data-centric world, where data fuels numerous application domains, with machine
learning at the forefront, handling the enormous volume of data efficiently in terms of time …

Cited by 13 · All 2 versions


First-generation inference accelerator deployment at facebook

