A survey on automatic parameter tuning for big data processing systems
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of
configuration parameters controlling parallelism, I/O behavior, memory settings, and …
MapReduce scheduling algorithms in Hadoop: a systematic study
Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses
Hadoop Distributed File System (HDFS) for storing data and uses MapReduce to process …
Big data analysis-based secure cluster management for optimized control plane in software-defined networks
In software-defined networks (SDNs), the abstracted control plane is its symbolic
characteristic, whose core component is the software-based controller. The control plane is …
IoTDeM: An IoT Big Data-oriented MapReduce performance prediction extended model in multiple edge clouds
Uploading all IoT Big Data to a centralized cloud for data analytics is infeasible
because of the excessive latency and bandwidth limitation of the Internet. A promising …
Predictive performance modeling for distributed batch processing using black box monitoring and machine learning
In many domains, the previous decade was characterized by increasing data volumes and
growing complexity of data analyses, creating new demands for batch processing on …
Towards analyzing the performance of hybrid edge-cloud processing
D Loghin, L Ramapantulu… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
While edge computing is gaining traction, organizations operating in geographically
distributed locations are still using cloud computing to collect and post-process data. In this …
Transition phase classification and prediction
J Lau, S Schoenmackers… - … Symposium on High …, 2005 - ieeexplore.ieee.org
Most programs are repetitive, where similar behavior can be seen at different execution
times. Proposed on-line systems automatically group these similar intervals of execution into …
Task scheduling in big data platforms: a systematic literature review
Context: Hadoop, Spark, Storm, and Mesos are very well-known frameworks in both
research and industrial communities that allow expressing and processing distributed …
A dynamic and failure-aware task scheduling framework for Hadoop
Hadoop has become a popular framework for processing data-intensive applications in
cloud environments. A core constituent of Hadoop is the scheduler, which is responsible for …
AutoToken: Predicting peak parallelism for big data analytics at Microsoft
Right-sizing resource allocation for big-data queries, particularly in serverless environments,
is critical for improving infrastructure operational efficiency, capacity availability, query …