- Academic Search

H Herodotou, Y Chen, J Lu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org

Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of
configuration parameters controlling parallelism, I/O behavior, memory settings, and …

Cited by 115

[PDF] usc.edu

Datasize-aware high dimensional configurations auto-tuning of in-memory cluster computing

Z Yu, Z Bei, X Qian - Proceedings of the Twenty-Third International …, 2018 - dl.acm.org

In-Memory cluster Computing (IMC) frameworks (e.g., Spark) have become increasingly
important because they typically achieve more than 10× speedups over the traditional On …

Cited by 119

[PDF] academia.edu

MROnline: MapReduce online performance tuning

M Li, L Zeng, S Meng, J Tan, L Zhang, AR Butt… - Proceedings of the 23rd …, 2014 - dl.acm.org

MapReduce job parameter tuning is a daunting and time consuming task. The parameter
configuration space is huge; there are more than 70 parameters that impact job …

Cited by 193

RFHOC: A random-forest approach to auto-tuning Hadoop's configuration

Z Bei, Z Yu, H Zhang, W Xiong, C Xu… - … on Parallel and …, 2015 - ieeexplore.ieee.org

Hadoop is a widely-used implementation framework of the MapReduce programming model
for large-scale data processing. Hadoop performance however is significantly affected by …

Cited by 114

[PDF] vt.edu

MEMTUNE: Dynamic memory management for in-memory data analytic platforms

L Xu, M Li, L Zhang, AR Butt, Y Wang… - 2016 IEEE international …, 2016 - ieeexplore.ieee.org

Memory is a crucial resource for big data processing frameworks such as Spark and M3R,
where the memory is used both for computation and for caching intermediate storage data …

Cited by 97

[PDF] kaust.edu.sa

Towards automatic parameter tuning of stream processing systems

M Bilal, M Canini - Proceedings of the 2017 Symposium on Cloud …, 2017 - dl.acm.org

Optimizing the performance of big-data streaming applications has become a daunting and
time-consuming task: parameters may be tuned from a space of hundreds or even …

Cited by 76

[PDF] acm.org

To tune or not to tune? In search of optimal configurations for data analytics

A Fekry, L Carata, T Pasquier, A Rice… - Proceedings of the 26th …, 2020 - dl.acm.org

This experimental study presents a number of issues that pose a challenge for practical
configuration tuning and its deployment in data analytics frameworks. These issues include …

Cited by 58

[PDF] acm.org

Rafiki: A middleware for parameter tuning of NoSQL datastores for dynamic metagenomics workloads

A Mahgoub, P Wood, S Ganesh, S Mitra… - Proceedings of the 18th …, 2017 - dl.acm.org

High performance computing (HPC) applications, such as metagenomics and other big data
systems, need to store and analyze huge volumes of semi-structured data. Such …

Cited by 73

[PDF] arxiv.org

Kea: Tuning an exabyte-scale data infrastructure

Y Zhu, S Krishnan, K Karanasos, I Tarte… - Proceedings of the …, 2021 - dl.acm.org

Microsoft's internal big-data infrastructure is one of the largest in the world, with over 300k
machines running billions of tasks from over 0.6M daily jobs. Operating this infrastructure is …

Cited by 21

GML: Efficiently auto-tuning Flink's configurations via guided machine learning

Y Guo, H Shan, S Huang, K Hwang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

The increasingly popular fused batch-streaming big data framework, Apache Flink, has
many performance-critical as well as untamed configuration parameters. However, how to …

Cited by 25

Gunther: Search-based auto-tuning of MapReduce

A survey on automatic parameter tuning for big data processing systems