Data preparation: A technological perspective and review

AAA Fernandes, M Koehler, N Konstantinou… - SN Computer …, 2023 - Springer
Data analysis often uses data sets that were collected for different purposes. Indeed, new
insights are often obtained by combining data sets that were produced independently of …

Explaining dataset changes for semantic data versioning with Explain-Da-V

R Shraga, RJ Miller - Proceedings of the VLDB Endowment, 2023 - par.nsf.gov
In multi-user environments in which data science and analysis is collaborative, multiple
versions of the same datasets are generated. While managing and storing data versions has …

A large reproducible benchmark on text classification for the legal domain based on the ECHR-OD repository

A Quemy, R Wrembel, N Łopuszyńska, G Papadakis… - Information Systems, 2023 - Elsevier
This work is a companion reproducible paper of our experiments and results reported in a
previous work Quemy and Wrembel (2022) introducing an open repository of legal …

VADA: an architecture for end user informed data preparation

N Konstantinou, E Abel, L Bellomarini, A Bogatu… - Journal of Big Data, 2019 - Springer
Background: Data scientists spend considerable amounts of time preparing data for analysis.
Data preparation is labour intensive because the data scientist typically takes fine grained …

Advances on data management and information systems

J Darmont, B Novikov, R Wrembel… - Information Systems …, 2022 - Springer
The research and technological area of data management encompasses various concepts,
techniques, algorithms and technologies, including data modeling, data integration and …

An Evaluation Framework for Machine Learning and Data Science (ML/DS) Based Financial Strategies: A Case Study Driven Decision Model

M Saadatmand, T Daim, C Mena… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Big data and computational technologies are increasingly important worldwide in asset and
investment management. Many investment management firms are adopting these data …

A hierarchical decision model for evaluating the Strategy Readiness of Quantitative Machine Learning/Data science-driven investment strategies

M Saadatmand - 2024 - search.proquest.com
Big data and computational technologies are increasingly important worldwide in asset and
investment management. Many investment management firms are adopting these data …

Boosting Methods Comparison: XGBoost, Adaboost and Gradient Boosting for Business Partner Performance Prediction

E Warni, DW Saputri, AAP Alimuddin… - 2024 8th …, 2024 - ieeexplore.ieee.org
Measuring the performance of business partners is a crucial technique for overseeing and
sustaining an organization or company's competitive edge. This process involves …

Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V (Technical Report)

R Shraga, RJ Miller - arXiv preprint arXiv:2301.13095, 2023 - arxiv.org
In multi-user environments in which data science and analysis is collaborative, multiple
versions of the same datasets are generated. While managing and storing data versions has …

SynthEdit: Format transformations by example using edit operations.

A Bogatu, AAA Fernandes, NW Paton, N Konstantinou - EDBT, 2019 - academia.edu
Format transformation is one of the most labor intensive tasks of a data wrangling process.
Recent advances in programming by example proposed synthesis algorithms that showed …