Data-centric AI: Perspectives and challenges

D Zha, ZP Bhat, KH Lai, F Yang, X Hu - Proceedings of the 2023 SIAM …, 2023 - SIAM
The role of data in building AI systems has recently been significantly magnified by the
emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model …

Communication-efficient federated learning via knowledge distillation

C Wu, F Wu, L Lyu, Y Huang, X Xie - Nature communications, 2022 - nature.com
Federated learning is a privacy-preserving machine learning technique to train intelligent
models from decentralized data, which enables exploiting private data by communicating …

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

Monolith: real time recommendation system with collisionless embedding table

Z Liu, L Zou, X Zou, C Wang, B Zhang, D Tang… - arXiv preprint arXiv …, 2022 - arxiv.org
Building a scalable and real-time recommendation system is vital for many businesses
driven by time-sensitive customer feedback, such as short-videos ranking or online ads …

Dreamshard: Generalizable embedding table placement for recommender systems

D Zha, L Feng, Q Tan, Z Liu, KH Lai… - Advances in …, 2022 - proceedings.neurips.cc
We study embedding table placement for distributed recommender systems, which aims to
partition and place the tables on multiple hardware devices (e.g., GPUs) to balance the …

Wukong: Towards a scaling law for large-scale recommendation

B Zhang, L Luo, Y Chen, J Nie, X Liu, D Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling laws play an instrumental role in the sustainable improvement in model quality.
Unfortunately, recommendation models to date do not exhibit such laws similar to those …

AdaEmbed: Adaptive embedding for Large-Scale recommendation models

F Lai, W Zhang, R Liu, W Tsai, X Wei, Y Hu… - … USENIX Symposium on …, 2023 - usenix.org
Deep learning recommendation models (DLRMs) are using increasingly larger embedding
tables to represent categorical sparse features such as video genres. Each embedding row …

Tenrec: A large-scale multipurpose benchmark dataset for recommender systems

G Yuan, F Yuan, Y Li, B Kong, S Li… - Advances in …, 2022 - proceedings.neurips.cc
Existing benchmark datasets for recommender systems (RS) either are created at a small
scale or involve very limited forms of user feedback. RS models evaluated on such datasets …

Models are codes: Towards measuring malicious code poisoning attacks on pre-trained model hubs

J Zhao, S Wang, Y Zhao, X Hou, K Wang… - Proceedings of the 39th …, 2024 - dl.acm.org
The proliferation of pre-trained models (PTMs) and datasets has led to the emergence of
centralized model hubs like Hugging Face, which facilitate collaborative development and …

Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large Scale Recommendation

L Luo, B Zhang, M Tsang, Y Ma… - Proceedings of …, 2024 - proceedings.mlsys.org
We study a mismatch between the deep learning recommendation models' flat architecture,
common distributed training paradigm and hierarchical data center topology. To address the …