Data provenance for cloud forensic investigations, security, challenges, solutions and future perspectives: A survey
It is extremely difficult to track down the original source of sensitive data from a variety of
sources in the cloud during transit and processing. For instance, data provenance, which …
Workflow provenance in the lifecycle of scientific machine learning
Abstract Machine learning (ML) has already fundamentally changed several businesses.
More recently, it has also been profoundly impacting the computational science and …
DfAnalyzer: runtime dataflow analysis tool for computational science and engineering applications
DfAnalyzer is a tool for monitoring, debugging, and analyzing dataflows generated by
Computational Science and Engineering (CSE) applications. It collects strategic raw data …
Raw data queries during data-intensive parallel workflow execution
Computer simulations consume and produce huge amounts of raw data files presented in
different formats, e.g., HDF5 in computational fluid dynamics simulations. Users often need to …
Towards optimizing the execution of spark scientific workflows using machine learning‐based parameter tuning
In the last few years, Apache Spark has become the de facto standard framework for big
data systems in both industry and academic projects. Spark is used to execute compute‐and …
Interactive data exploration of distributed raw files: A systematic mapping study
When exploring big amounts of data without a clear target, providing an interactive
experience becomes really difficult, since this tentative inspection usually defeats any early …
Data reduction in scientific workflows using provenance monitoring and user steering
Scientific workflows need to be iteratively, and often interactively, executed for large input
datasets. Reducing data from input datasets is a powerful way to reduce overall execution …
Position Paper on Dataset Engineering to Accelerate Science
Data is a critical element in any discovery process. In the last decades, we observed
exponential growth in the volume of available data and the technology to manipulate it …
PresQ: Discovery of Multidimensional Equally-Distributed Dependencies via Quasi-Cliques on Hypergraphs
A Álvarez-Ayllón, M Palomo-Duarte… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Cross-matching data stored on separate files is an everyday activity in the scientific domain.
However, sometimes the relation between attributes may not be obvious. The discovery of …
In situ data steering on sedimentation simulation with provenance data
(AMR) are optimal strategies for tackling large-scale simulations. libMesh is an open-source
finite-element library that supports parallel AMR and is used in multiphysics applications. In …