tf.data: A machine learning data processing framework

DG Murray, J Simsa, A Klimovic, I Indyk - arXiv preprint arXiv:2101.12127, 2021 - arxiv.org
Training machine learning models requires feeding input data for models to ingest. Input
pipelines for machine learning jobs are often challenging to implement efficiently as they …

End-to-end optimization of machine learning prediction queries

K Park, K Saur, D Banda, R Sen, M Interlandi… - Proceedings of the …, 2022 - dl.acm.org
Prediction queries are widely used across industries to perform advanced analytics and
draw insights from data. They include a data processing part (e.g., for joining, filtering …

Production machine learning pipelines: Empirical analysis and optimization opportunities

D Xin, H Miao, A Parameswaran… - Proceedings of the 2021 …, 2021 - dl.acm.org
Machine learning (ML) is now commonplace, powering data-driven applications in various
organizations. Unlike the traditional perception of ML in research, ML production pipelines …

A tensor compiler for unified machine learning prediction serving

S Nakandala, K Saur, GI Yu, K Karanasos… - … USENIX Symposium on …, 2020 - usenix.org
Machine Learning (ML) adoption in the enterprise requires simpler and more efficient
software infrastructure—the bespoke solutions typical in large web companies are simply …

Designing an open framework for query optimization and compilation

M Jungmair, A Kohn, J Giceva - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Since its invention, data-centric code generation has been adopted for query compilation by
various database systems in academia and industry. These database systems are fast but …

Query processing on tensor computation runtimes

D He, S Nakandala, D Banda, R Sen, K Saur… - arXiv preprint arXiv …, 2022 - arxiv.org
The huge demand for computation in artificial intelligence (AI) is driving unparalleled
investments in hardware and software systems for AI. This leads to an explosion in the …

Distributed deep learning on data systems: a comparative analysis of approaches

Y Zhang, F McQuillan, N Jayaram, N Kak… - Proceedings of the …, 2021 - par.nsf.gov
Deep learning (DL) is growing in popularity for many data analytics applications, including
among enterprises. Large business-critical datasets in such settings typically reside in …

Efficient execution of user-defined functions in SQL queries

Y Foufoulas, A Simitsis - Proceedings of the VLDB Endowment, 2023 - dl.acm.org
User-defined functions (UDFs) have been widely used to overcome the expressivity
limitations of SQL and complement its declarative nature with functional capabilities. UDFs …

Babelfish: Efficient execution of polyglot queries

PM Grulich, S Zeuch, V Markl - Proceedings of the VLDB Endowment, 2021 - dl.acm.org
Today's users of data processing systems come from different domains, have different levels
of expertise, and prefer different programming languages. As a result, analytical workload …

Data science through the looking glass: Analysis of millions of GitHub notebooks and ML.NET pipelines

F Psallidas, Y Zhu, B Karlas, J Henkel… - ACM SIGMOD …, 2022 - dl.acm.org
The recent success of machine learning (ML) has led to an explosive growth of systems and
applications built by an ever-growing community of system builders and data science (DS) …