A comprehensive survey on trustworthy recommender systems
As one of the most successful AI-powered applications, recommender systems aim to help
people make appropriate decisions in an effective and efficient way, by providing …
Splitwise: Efficient generative LLM inference using phase splitting
Generative large language model (LLM) applications are growing rapidly, leading to large-
scale deployments of expensive and power-hungry GPUs. Our characterization of LLM …
Deep learning workload scheduling in GPU datacenters: A survey
Deep learning (DL) has demonstrated its remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …
SpAtten: Efficient sparse attention architecture with cascade token and head pruning
The attention mechanism is becoming increasingly popular in Natural Language Processing
(NLP) applications, showing superior performance than convolutional and recurrent …
Chasing carbon: The elusive environmental footprint of computing
Given recent algorithm, software, and hardware innovation, computing has enabled a
plethora of new applications. As computing becomes increasingly ubiquitous, however, so …
Hardware architecture and software stack for PIM based on commercial DRAM technology: Industrial product
Emerging applications such as deep neural network demand high off-chip memory
bandwidth. However, under stringent physical constraints of chip packages and system …
Elliot: A comprehensive and rigorous framework for reproducible recommender systems evaluation
Recommender Systems have shown to be an effective way to alleviate the over-choice
problem and provide accurate and tailored recommendations. However, the impressive …
Faa$T: A transparent auto-scaling cache for serverless applications
Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy
their applications without the burden of managing the underlying infrastructure. However …
Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks
Deep Neural Networks (DNNs) have reinvigorated real-world applications that rely on
learning patterns of data and are permeating into different industries and markets. Cloud …
RecSSD: near data processing for solid state drive based recommendation inference
Neural personalized recommendation models are used across a wide variety of datacenter
applications including search, social media, and entertainment. State-of-the-art models …