Heterogeneous acceleration pipeline for recommendation system training
Recommendation models rely on deep learning networks and large embedding tables,
resulting in computationally and memory-intensive processes. These models are typically …
Atom: An efficient query serving system for embedding-based knowledge graph reasoning with operator-level batching
Knowledge graph reasoning (KGR) answers logical queries over a knowledge graph (KG),
and embedding-based KGR (EKGR) has become popular recently, which embeds both queries …
UpDLRM: Accelerating personalized recommendation using real-world PIM architecture
Deep Learning Recommendation Models (DLRMs) have gained popularity in
recommendation systems due to their effectiveness in handling large-scale recommendation …
MemANNS: Enhancing Billion-Scale ANNS Efficiency with Practical PIM Hardware
In numerous production environments, Approximate Nearest Neighbor Search (ANNS) plays
an indispensable role, particularly when dealing with massive datasets that can contain …
Embedding Optimization for Training Large-scale Deep Learning Recommendation Systems with EMBark
S Liu, N Zheng, H Kang, X Simmons, J Zhang… - Proceedings of the 18th …, 2024 - dl.acm.org
Training large-scale deep learning recommendation models (DLRMs) with embedding
tables stretching across multiple GPUs in a cluster presents a unique challenge, demanding …
SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules
This paper documents our characterization study and practices for serving text-to-image
requests with stable diffusion models in production. We first comprehensively analyze …
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs
Personalized recommendation is a ubiquitous application on the internet, with many
industries and hyperscalers extensively leveraging Deep Learning Recommendation …
Exploiting Structured Feature and Runtime Isolation for High-Performant Recommendation Serving
Recommendation serving with deep learning models is one of the most valuable services of
modern E-commerce companies. In production, to accommodate billions of recommendation …
Disaggregating Embedding Recommendation Systems with FlexEMR
Efficiently serving embedding-based recommendation (EMR) models remains a significant
challenge due to their increasingly large memory requirements. Today's practice splits the …
Amplify Graph Learning for Recommendation via Sparsity Completion
P Yuan, H Li, M Fang, X Yu, Y Hao, J Du - arXiv preprint arXiv:2406.18984, 2024 - arxiv.org
Graph learning models have been widely deployed in collaborative filtering (CF) based
recommendation systems. Due to the issue of data sparsity, the graph structure of the …