Optimization of complex dataflows with user-defined functions

A Rheinländer, U Leser, G Graefe - ACM Computing Surveys (CSUR), 2017 - dl.acm.org
In many fields, recent years have brought a sharp rise in the size of the data to be analyzed
and the complexity of the analysis to be performed. Such analyses are often described as …

Generalized Sub-Query Fusion for Eliminating Redundant I/O from Big-Data Queries

P Sarthi, K Rajan, A Lal, A Modi, P Jain, M Liu… - … USENIX Symposium on …, 2020 - usenix.org
SQL is the de-facto language for big-data analytics. Despite the cost of distributed SQL
execution being dominated by disk and network I/O, we find that state-of-the-art optimizers …

Scalable and Declarative Information Extraction in a Parallel Data Analytics System

A Rheinländer - 2017 - edoc.hu-berlin.de
Information extraction (IE) on very large data sets requires highly complex, scalable, and
adaptive systems. Although numerous IE algorithms exist, their seamless and extensible …