XInsight: explainable data analysis through the lens of causality

P Ma, R Ding, S Wang, S Han, D Zhang - … of the ACM on Management of …, 2023 - dl.acm.org
In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the
underlying causes of the knowledge acquired by EDA is crucial. However, it remains under …

Workflow provenance in the lifecycle of scientific machine learning

R Souza, LG Azevedo, V Lourenço… - Concurrency and …, 2022 - Wiley Online Library
Abstract Machine learning (ML) has already fundamentally changed several businesses.
More recently, it has also been profoundly impacting the computational science and …

On explaining confounding bias

B Youngmann, M Cafarella… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
When analyzing large datasets, analysts are often interested in the explanations for
unexpected results produced by their queries. In this work, we focus on aggregate SQL …

PROV-IO: An I/O-centric provenance framework for scientific data on HPC systems

R Han, S Byna, H Tang, B Dong, M Zheng - Proceedings of the 31st …, 2022 - dl.acm.org
Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on
HPC systems, scientists often seek diverse provenance (eg, origins of data products, usage …

Putting things into context: Rich explanations for query answers using join graphs

C Li, Z Miao, Q Zeng, B Glavic, S Roy - Proceedings of the 2021 …, 2021 - dl.acm.org
In many data analysis applications there is a need to explain why a surprising or interesting
result was produced by a query. Previous approaches to explaining results have directly or …

PROV-IO: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems

R Han, M Zheng, S Byna, H Tang… - … on Parallel and …, 2024 - ieeexplore.ieee.org
Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on
HPC systems, scientists often seek diverse provenance (eg, origins of data products, usage …

Trends in explanations: Understanding and debugging data-driven systems

B Glavic, A Meliou, S Roy - Foundations and Trends® in Databases, 2021 - par.nsf.gov
Humans reason about the world around them by seeking to understand why and how
something occurs. The same principle extends to the technology that so many of human …

Banzhaf Values for Facts in Query Answering

O Abramovich, D Deutch, N Frost, A Kara… - Proceedings of the ACM …, 2024 - dl.acm.org
Quantifying the contribution of database facts to query answers has been studied as means
of explanation. The Banzhaf value, originally developed in Game Theory, is a natural …

Validity constraints for data analysis workflows

F Schintke, K Belhajjame, N De Mecquenem… - Future Generation …, 2024 - Elsevier
Porting a scientific data analysis workflow (DAW) to a cluster infrastructure, a new software
stack, or even only a new dataset with some notably different properties is often challenging …

Fedex: An explainability framework for data exploration steps

D Deutch, A Gilad, T Milo, A Mualem… - arXiv preprint arXiv …, 2022 - arxiv.org
When exploring a new dataset, Data Scientists often apply analysis queries, look for insights
in the resulting dataframe, and repeat to apply further queries. We propose in this paper a …