Learning linear regression models over factorized joins

M Schleich, D Olteanu, R Ciucanu - Proceedings of the 2016 …, 2016 - dl.acm.org
We investigate the problem of building least squares regression models over training
datasets defined by arbitrary join queries on database tables. Our key observation is that …

Size bounds for factorised representations of query results

D Olteanu, J Závodný - ACM Transactions on Database Systems (TODS …, 2015 - dl.acm.org
We study two succinct representation systems for relational data based on relational algebra
expressions with unions, Cartesian products, and singleton relations: f-representations …

Learning generalized linear models over normalized data

A Kumar, J Naughton, JM Patel - Proceedings of the 2015 ACM SIGMOD …, 2015 - dl.acm.org
Enterprise data analytics is a booming area in the data management industry. Many
companies are racing to develop toolkits that closely integrate statistical and machine …

KÙZU graph database management system

X Feng, G Jin, Z Chen, C Liu, S Salihoğlu - CIDR, 2023 - cs.uwaterloo.ca
Datasets and workloads of popular applications that use graph database management
systems (GDBMSs) require a set of storage and query processing features that RDBMSs do …

The LDBC social network benchmark: Business intelligence workload

G Szárnyas, J Waudby, BA Steer, D Szakállas… - Proceedings of the …, 2022 - dl.acm.org
The Social Network Benchmark's Business Intelligence workload (SNB BI) is a
comprehensive graph OLAP benchmark targeting analytical data systems capable of …

What Goes Around Comes Around... And Around...

M Stonebraker, A Pavlo - ACM Sigmod Record, 2024 - dl.acm.org
Two decades ago, one of us co-authored a paper commenting on the previous 40 years of
data modelling research and development [188]. That paper demonstrated that the relational …

Factorized databases

D Olteanu, M Schleich - ACM SIGMOD Record, 2016 - dl.acm.org
This paper overviews factorized databases and their application to machine learning. The
key observation underlying this work is that state-of-the-art relational query processing …

Towards linear algebra over normalized data

L Chen, A Kumar, J Naughton, JM Patel - arXiv preprint arXiv:1612.07448, 2016 - arxiv.org
Providing machine learning (ML) over relational data is a mainstream requirement for data
analytics systems. While almost all the ML tools require the input data to be presented as a …

A layered aggregate engine for analytics workloads

M Schleich, D Olteanu, M Abo Khamis… - Proceedings of the …, 2019 - dl.acm.org
This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-
memory optimization and execution engine for batches of aggregates over the input …

Data provenance

B Glavic - Foundations and Trends® in Databases, 2021 - nowpublishers.com
Data provenance has evolved from a niche topic to a mainstream area of research in
databases and other research communities. This article gives a comprehensive introduction …