Naiad: a timely dataflow system

DG Murray, F McSherry, R Isaacs, M Isard… - Proceedings of the …, 2013 - dl.acm.org
Naiad is a distributed system for executing data parallel, cyclic dataflow programs. It offers
the high throughput of batch processors, the low latency of stream processors, and the ability …

Discretized streams: Fault-tolerant streaming computation at scale

M Zaharia, T Das, H Li, T Hunter, S Shenker… - Proceedings of the …, 2013 - dl.acm.org
Many "big data" applications must act on data in real time. Running these applications at
ever-larger scales requires parallel platforms that automatically handle faults and stragglers …

The Stratosphere platform for big data analytics

A Alexandrov, R Bergmann, S Ewen, JC Freytag… - The VLDB Journal, 2014 - Springer
We present Stratosphere, an open-source software stack for parallel data analysis.
Stratosphere brings together a unique set of features that allow the expressive, easy, and …

Structured Streaming: A declarative API for real-time applications in Apache Spark

M Armbrust, T Das, J Torres, B Yavuz, S Zhu… - Proceedings of the …, 2018 - dl.acm.org
With the ubiquity of real-time data, organizations need streaming systems that are scalable,
easy to use, and easy to integrate into business applications. Structured Streaming is a new …

[BOOK][B] Designing data-intensive applications: The big ideas behind reliable, scalable, and maintainable systems

M Kleppmann - 2017 - books.google.com
Data is at the center of many challenges in system design today. Difficult issues need to be
figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In …

Discretized streams: an efficient and fault-tolerant model for stream processing on large clusters

M Zaharia, T Das, H Li, S Shenker, I Stoica - 4th USENIX Workshop on …, 2012 - usenix.org
Many important “big data” applications need to process data arriving in real time. However,
current programming models for distributed stream processing are relatively low-level, often …

From "think like a vertex" to "think like a graph"

Y Tian, A Balmin, SA Corsten, S Tatikonda… - Proceedings of the …, 2013 - dl.acm.org
To meet the challenge of processing rapidly growing graph and network data created by
modern applications, a number of distributed graph processing systems have emerged …

Big graphs: challenges and opportunities

W Fan - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Big data is typically characterized with 4V's: Volume, Velocity, Variety and Veracity. When it
comes to big graphs, these challenges become even more staggering. Each and every of …

A survey of large-scale analytical query processing in MapReduce

C Doulkeridis, K Nørvåg - The VLDB Journal, 2014 - Springer
Enterprises today acquire vast volumes of data from different sources and leverage this
information by means of data analysis to support effective decision-making and provide new …

Hedgecut: Maintaining randomised trees for low-latency machine unlearning

S Schelter, S Grafberger, T Dunning - Proceedings of the 2021 …, 2021 - dl.acm.org
Software systems that learn from user data with machine learning (ML) have become
ubiquitous over the last years. Recent law such as the "General Data Protection …