Sustainable AI: Environmental implications, challenges and opportunities
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …
Communication-efficient distributed deep learning: A comprehensive survey
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
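The survey's subject is the communication step of synchronous data-parallel training. The toy sketch below simulates that step with NumPy, averaging per-worker gradients before each update (the all-reduce that communication-efficient methods try to compress or overlap); the model, synthetic data, and learning rate are illustrative assumptions, not anything taken from the survey.

    # Illustrative sketch: synchronous data-parallel SGD where each step
    # averages the workers' local gradients (the "all-reduce" step).
    import numpy as np

    rng = np.random.default_rng(0)
    n_workers, dim = 4, 8
    w = np.zeros(dim)                                 # shared model parameters
    data = rng.normal(size=(n_workers, 64, dim))      # one data shard per worker
    true_w = rng.normal(size=dim)
    targets = data @ true_w                           # synthetic regression targets

    def local_gradient(w, X, y):
        # Gradient of mean squared error on this worker's shard.
        return 2.0 * X.T @ (X @ w - y) / len(y)

    for step in range(200):
        grads = [local_gradient(w, data[k], targets[k]) for k in range(n_workers)]
        g = np.mean(grads, axis=0)                    # communication: average gradients
        w -= 0.01 * g                                 # identical update on every worker

    print("final error:", float(np.mean((data @ w - targets) ** 2)))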
LIRA: Learnable, imperceptible and robust backdoor attacks
Recently, machine learning models have been shown to be vulnerable to backdoor
attacks, primarily due to the lack of transparency in black-box models such as deep neural …
ZeRO-Infinity: Breaking the GPU memory wall for extreme scale deep learning
In the last three years, the largest dense deep learning models have grown over 1000x to
reach hundreds of billions of parameters, while the GPU memory has only grown by 5x (16 …
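The 1000x-versus-5x gap the abstract refers to is easiest to see with a quick back-of-the-envelope calculation; the model size, bytes-per-parameter estimate, and GPU capacity below are illustrative assumptions, not figures taken from the paper.

    # Rough memory-wall arithmetic (assumed numbers): model state for
    # mixed-precision Adam is often estimated at ~16 bytes per parameter
    # (fp16 weights + fp16 grads + fp32 weights, momentum, and variance).
    params = 100e9                 # hypothetical 100B-parameter dense model
    bytes_per_param = 16           # rough mixed-precision Adam state
    gpu_hbm_gb = 80                # e.g., one 80 GB HBM GPU

    state_gb = params * bytes_per_param / 1e9
    print(f"model state: {state_gb:.0f} GB")                        # ~1600 GB
    print(f"GPUs needed for state alone: {state_gb / gpu_hbm_gb:.0f}")  # ~20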
RecSSD: near data processing for solid state drive based recommendation inference
Neural personalized recommendation models are used across a wide variety of datacenter
applications including search, social media, and entertainment. State-of-the-art models …
Understanding training efficiency of deep learning recommendation models at scale
The use of GPUs has proliferated for machine learning workflows and is now considered
mainstream for many deep learning models. Meanwhile, when training state-of-the-art …
RecShard: statistical feature-based memory optimization for industry-scale neural recommendation
We propose RecShard, a fine-grained embedding table (EMB) partitioning and placement
technique for deep learning recommendation models (DLRMs). RecShard is designed …
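The snippet only hints at how the partitioning works, so the following is a hedged illustration of the general idea of statistics-driven placement (hot embedding rows in a fast-memory budget, the cold tail in slower memory); the thresholding heuristic and the Zipf-distributed access counts are assumptions for illustration, not RecShard's actual algorithm.

    # Illustrative sketch: split an embedding table by per-row access counts
    # so the hottest rows fit a fast-memory budget and the tail goes elsewhere.
    import numpy as np

    def split_by_access_stats(access_counts, row_bytes, fast_budget_bytes):
        order = np.argsort(access_counts)[::-1]        # most-accessed rows first
        n_fast = int(fast_budget_bytes // row_bytes)
        hot_rows = order[:n_fast]                      # place in the fast tier
        cold_rows = order[n_fast:]                     # place in the slower tier
        covered = access_counts[hot_rows].sum() / access_counts.sum()
        return hot_rows, cold_rows, covered

    # Skewed (Zipf-like) access counts are typical for sparse categorical features.
    counts = np.random.default_rng(0).zipf(1.2, size=1_000_000).astype(float)
    hot, cold, hit_rate = split_by_access_stats(counts, row_bytes=256,
                                                fast_budget_bytes=64 * 2**20)
    print(f"{len(hot)} hot rows cover {hit_rate:.1%} of lookups")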
A comprehensive survey on trustworthy recommender systems
As one of the most successful AI-powered applications, recommender systems aim to help
people make appropriate decisions in an effective and efficient way, by providing …
DreamShard: Generalizable embedding table placement for recommender systems
We study embedding table placement for distributed recommender systems, which aims to
partition and place the tables on multiple hardware devices (e.g., GPUs) to balance the …
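As a rough intuition for the placement problem described above, the sketch below assigns whole tables to the least-loaded device with a greedy largest-cost-first heuristic; the per-table costs are hypothetical and the heuristic is only a common balancing baseline, not DreamShard's learned approach.

    # Baseline sketch of embedding table placement: put each table (largest
    # cost first) on the device with the smallest accumulated cost so far.
    import heapq

    def greedy_placement(table_costs, n_devices):
        heap = [(0.0, d) for d in range(n_devices)]    # (current load, device id)
        heapq.heapify(heap)
        placement = {}
        for table, cost in sorted(table_costs.items(), key=lambda kv: -kv[1]):
            load, dev = heapq.heappop(heap)
            placement[table] = dev
            heapq.heappush(heap, (load + cost, dev))
        return placement

    # Hypothetical per-table costs (e.g., lookup + communication time in ms).
    costs = {"emb_user": 4.0, "emb_item": 3.5, "emb_ad": 2.0,
             "emb_geo": 0.7, "emb_page": 1.2, "emb_device": 0.4}
    print(greedy_placement(costs, n_devices=2))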
HET: scaling out huge embedding model training via cache-enabled distributed framework
Embedding models have been an effective learning paradigm for high-dimensional data.
However, one open issue of embedding models is that their representations (latent factors) …