Heterogeneous acceleration pipeline for recommendation system training

M Adnan, YE Maboud, D Mahajan… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Recommendation models rely on deep learning networks and large embedding tables,
resulting in computationally and memory-intensive processes. These models are typically …

Atom: An efficient query serving system for embedding-based knowledge graph reasoning with operator-level batching

Q Zhou, P Yin, X Yan, C Li, G Jiang… - Proceedings of the ACM on …, 2024 - dl.acm.org
Knowledge graph reasoning (KGR) answers logical queries over a knowledge graph (KG),
and embedding-based KGR (EKGR) becomes popular recently, which embeds both queries …

UpDLRM: Accelerating personalized recommendation using real-world PIM architecture

S Chen, H Tan, AC Zhou, Y Li, P Balaji - … of the 61st ACM/IEEE Design …, 2024 - dl.acm.org
Deep Learning Recommendation Models (DLRMs) have gained popularity in
recommendation systems due to their effectiveness in handling large-scale recommendation …

MemANNS: Enhancing Billion-Scale ANNS Efficiency with Practical PIM Hardware

S Chen, AC Zhou, Y Shi, Y Li, X Yao - arXiv preprint arXiv:2410.23805, 2024 - arxiv.org
In numerous production environments, Approximate Nearest Neighbor Search (ANNS) plays
an indispensable role, particularly when dealing with massive datasets that can contain …

Embedding Optimization for Training Large-scale Deep Learning Recommendation Systems with EMBark

S Liu, N Zheng, H Kang, X Simmons, J Zhang… - Proceedings of the 18th …, 2024 - dl.acm.org
Training large-scale deep learning recommendation models (DLRMs) with embedding
tables stretching across multiple GPUs in a cluster presents a unique challenge, demanding …

SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

S Li, L Yang, X Jiang, H Lu, Z Di, W Lu, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper documents our characterization study and practices for serving text-to-image
requests with stable diffusion models in production. We first comprehensively analyze …

Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs

R Jain, VM Bhasi, A Jog… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Personalized recommendation is a ubiquitous application on the internet, with many
industries and hyperscalers extensively leveraging Deep Learning Recommendation …

Exploiting Structured Feature and Runtime Isolation for High-Performant Recommendation Serving

X You, H Yang, S Wang, T Peng, C Ding… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Recommendation serving with deep learning models is one of the most valuable services of
modern E-commerce companies. In production, to accommodate billions of recommendation …

Disaggregating Embedding Recommendation Systems with FlexEMR

Y Huang, Z Yang, J Xing, Y Dai, Y Qiu, D Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficiently serving embedding-based recommendation (EMR) models remains a significant
challenge due to their increasingly large memory requirements. Today's practice splits the …

Amplify Graph Learning for Recommendation via Sparsity Completion

P Yuan, H Li, M Fang, X Yu, Y Hao, J Du - arXiv preprint arXiv:2406.18984, 2024 - arxiv.org
Graph learning models have been widely deployed in collaborative filtering (CF) based
recommendation systems. Due to the issue of data sparsity, the graph structure of the …