Learning linear regression models over factorized joins
We investigate the problem of building least squares regression models over training
datasets defined by arbitrary join queries on database tables. Our key observation is that …
Size bounds for factorised representations of query results
We study two succinct representation systems for relational data based on relational algebra
expressions with unions, Cartesian products, and singleton relations: f-representations …
Learning generalized linear models over normalized data
Enterprise data analytics is a booming area in the data management industry. Many
companies are racing to develop toolkits that closely integrate statistical and machine …
KÙZU graph database management system
Datasets and workloads of popular applications that use graph database management
systems (GDBMSs) require a set of storage and query processing features that RDBMSs do …
The LDBC social network benchmark: Business intelligence workload
The Social Network Benchmark's Business Intelligence workload (SNB BI) is a
comprehensive graph OLAP benchmark targeting analytical data systems capable of …
What Goes Around Comes Around... And Around...
M Stonebraker, A Pavlo - ACM Sigmod Record, 2024 - dl.acm.org
Two decades ago, one of us co-authored a paper commenting on the previous 40 years of
data modelling research and development [188]. That paper demonstrated that the relational …
Factorized databases
This paper overviews factorized databases and their application to machine learning. The
key observation underlying this work is that state-of-the-art relational query processing …
Towards linear algebra over normalized data
Providing machine learning (ML) over relational data is a mainstream requirement for data
analytics systems. While almost all the ML tools require the input data to be presented as a …
A layered aggregate engine for analytics workloads
This paper introduces LMFAO (Layered Multiple Functional Aggregate Optimization), an in-
memory optimization and execution engine for batches of aggregates over the input …
Data provenance
B Glavic - Foundations and Trends® in Databases, 2021 - nowpublishers.com
Data provenance has evolved from a niche topic to a mainstream area of research in
databases and other research communities. This article gives a comprehensive introduction …