Big data provenance: Challenges and implications for benchmarking

B Glavic - Workshop on Big Data Benchmarks, 2012 - Springer
Data Provenance is information about the origin and creation process of data. Such
information is useful for debugging data and transformations, auditing, evaluating the quality …

Improving the usefulness of research data with better paradata

I Huvila - Open Information Science, 2022 - degruyter.com
Considerable investments have been made in Europe and worldwide for developing
research data infrastructures. Instead of a general lack of data about data, it has become …

Data provenance collection and security in a distributed environment: a survey

W Ametepe, C Wang, SK Ocansey, X Li… - International Journal of …, 2021 - Taylor & Francis
Keeping track of lifecycle history and information origins are important because that is the
only key issue to confirm information probative value and integrity. Provenance information …

Efficient scheduling of scientific workflows using hot metadata in a multisite cloud

J Liu, L Pineda, E Pacitti, A Costan… - … on Knowledge and …, 2018 - ieeexplore.ieee.org
Large-scale, data-intensive scientific applications are often expressed as scientific
workflows (SWfs). In this paper, we consider the problem of efficient scheduling of a large …

Knowledge retrieval (kr)

Y Yao, Y Zeng, N Zhong… - IEEE/WIC/ACM …, 2007 - ieeexplore.ieee.org
With the ever-increasing growth of data and information, finding the right knowledge
becomes a real challenge and an urgent task. Traditional data and information retrieval …

Data usage control for distributed systems

F Kelbert, A Pretschner - ACM Transactions on Privacy and Security …, 2018 - dl.acm.org
Data usage control enables data owners to enforce policies over how their data may be
used after they have been released and accessed. We address distributed aspects of this …

LDV: Light-weight database virtualization

Q Pham, T Malik, B Glavic… - 2015 IEEE 31st …, 2015 - ieeexplore.ieee.org
We present a light-weight database virtualization (LDV) system that allows users to share
and re-execute applications that operate on a relational database (DB). Previous methods …

Sketching distributed data provenance

T Malik, A Gehani, D Tariq, F Zaffar - Data Provenance and Data …, 2013 - Springer
Users can determine the precise origins of their data by collecting detailed provenance
records. However, auditing at a finer grain produces large amounts of metadata. To …

Interactive virtual expert system for advising (InVEStA)

D Pokrajac, M Rasamny - Proceedings. Frontiers in Education …, 2006 - ieeexplore.ieee.org
We propose and develop InVEStA-interactive virtual expert system for advising-to assist
undergraduate students and their advisors in providing timely, accurate and conflict-free …

Securing ultra-high-bandwidth science DMZ networks with coordinated situational awareness

V Nagendra, V Yegneswaran, P Porras - … of the 16th ACM Workshop on …, 2017 - dl.acm.org
The Science DMZ (SDMZ) is a special purpose network infrastructure that is engineered to
cater to the ultra-high bandwidth needs of the scientific and high performance computing …