Smoke: Fine-grained lineage at interactive speed
Data lineage describes the relationship between individual input and output data items of a
workflow, and has served as an integral ingredient for both traditional (e.g., debugging …
Capturing and querying fine-grained provenance of preprocessing pipelines in data science
Data processing pipelines that are designed to clean, transform and alter data in preparation
for learning predictive models, have an impact on those models' accuracy and performance …
Data provenance
B Glavic - Foundations and Trends® in Databases, 2021 - nowpublishers.com
Data provenance has evolved from a niche topic to a mainstream area of research in
databases and other research communities. This article gives a comprehensive introduction …
Bao: Learning to steer query optimizers
Query optimization remains one of the most challenging problems in data management
systems. Recent efforts to apply machine learning techniques to query optimization …
Supporting Better Insights of Data Science Pipelines with Fine-grained Provenance
Successful data-driven science requires complex data engineering pipelines to clean,
transform, and alter data in preparation for machine learning, and robust results can only be …
GProM - a Swiss army knife for your provenance needs
We present an overview of GProM, a generic provenance middleware for relational
databases. The system supports diverse provenance and annotation management tasks …
PUG: a framework and practical implementation for why and why-not provenance
Explaining why an answer is (or is not) returned by a query is important for many
applications including auditing, debugging data and queries, and answering hypothetical …
In-memory blockchain: Toward efficient and trustworthy data provenance for HPC systems
The state-of-the-art approaches for tracking data provenance on high-performance
computing (HPC) systems are either supported by file systems or relational databases …
You Say 'What', I Hear 'Where' and 'Why': (Mis-)Interpreting SQL to Derive Fine-Grained Provenance
SQL declaratively specifies what the desired output of a query is. This work shows that a non-
standard interpretation of the SQL semantics can, instead, disclose where a piece of the …
Toward accurate and efficient emulation of public blockchains in the cloud
Blockchain is an enabler of many emerging decentralized applications in areas of
cryptocurrency, Internet of Things, smart healthcare, among many others. Although various …