The family of MapReduce and large-scale data processing systems
In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …
Parallel processing systems for big data: a survey
The volume, variety, and velocity properties of big data and the valuable information it
contains have motivated the investigation of many new parallel data processing systems in …
The Stratosphere platform for big data analytics
We present Stratosphere, an open-source software stack for parallel data analysis.
Stratosphere brings together a unique set of features that allow the expressive, easy, and …
Elastic scaling for data stream processing
This article addresses the profitability problem associated with auto-parallelization of
general-purpose distributed data stream processing applications. Auto-parallelization …
Auto-scaling techniques for elastic data stream processing
Typical use cases like financial trading or monitoring of manufacturing equipment pose huge
challenges regarding end-to-end latency as well as throughput towards existing data stream …
The DEBS 2012 grand challenge
The goal of the DEBS Grand Challenge series is to contribute to the Event Processing Grand
Challenge, that serves as a common goal and mechanism for coordinating research …
Pricing approaches for data markets
Currently, multiple data vendors utilize the cloud-computing paradigm for trading raw data,
associated analytical services, and analytic results as a commodity good. We observe that …
From conceptual design to performance optimization of ETL workflows: current state of research and open problems
In this paper, we discuss the state of the art and current trends in designing and optimizing
ETL workflows. We explain the existing techniques for: (1) constructing a conceptual and a …
A survey on vertical and horizontal scaling platforms for big data analytics
There is no doubt that we are entering the era of big data. The challenge is on how to store,
search, and analyze the huge amount of data that is being generated per second. One of the …
Big data 2.0 processing systems: Taxonomy and open challenges
Data is a key resource in the modern world. Big data has become a popular term which is
used to describe the exponential growth and availability of data. In practice, the growing …