Setting up a big data project: Challenges, opportunities, technologies and optimization

RV Zicari, M Rosselli, T Ivanov, N Korfiatis… - Big Data optimization …, 2016 - Springer
In the first part of this chapter we illustrate how a big data project can be set up and
optimized. We explain the general value of big data analytics for the enterprise and how …

Big data processing on single board computer clusters: Exploring challenges and possibilities

E Lee, H Oh, D Park - IEEE Access, 2021 - ieeexplore.ieee.org
For more than a decade, “big data” has been an industry and academia buzz phrase. Over
this time, many companies adopted Apache Hadoop and Spark frameworks for their …

On energy efficiency and performance evaluation of single board computer based clusters: A Hadoop case study

B Qureshi, A Koubâa - Electronics, 2019 - mdpi.com
Energy efficiency in a data center is a challenge and has garnered researchers' interest. In
this study, we addressed the energy efficiency issue of a small scale data center by utilizing …

An experience report on building a big data analytics framework using Cloudera CDH and RapidMiner Radoop with a cluster of commodity computers

S Kunnakorntammanop, N Thepwuttisathaphon… - Soft Computing in Data …, 2019 - Springer
Many real-world data are not only large in volume but also heterogeneous and fast
generated. This type of data, known as big data, typically cannot be analyzed by using …

Deadline-aware preemptive job scheduling in Hadoop YARN clusters

Y Gao, K Zhang - 2022 IEEE 25th International Conference on …, 2022 - ieeexplore.ieee.org
As a popular open-source framework for big data processing, Hadoop Yarn has been widely
used by large internet and e-commerce companies such as Amazon, Alibaba, and …

Sequence-to-sequence models for workload interference prediction on batch processing datacenters

D Buchaca, J Marcual, JLL Berral, D Carrera - Future Generation Computer …, 2020 - Elsevier
Co-scheduling of jobs in data centers is a challenging scenario where jobs can compete for
resources, leading to severe slowdowns or failed executions. Efficient job placement on …

Towards the prediction of the performance and energy efficiency of distributed data management systems

R Niemann - Companion Publication for ACM/SPEC on International …, 2016 - dl.acm.org
The ability to accurately simulate and predict the metrics (e.g., performance and energy
consumption) of data management systems offers several benefits. It can save investments …

On performance of commodity single board computer-based clusters: A big data perspective

B Qureshi, A Koubâa - … and Applications: Foundations for Smarter Cities …, 2020 - Springer
In recent times, commodity Single Board Computers (SBCs) have become
sufficiently powerful that they can run standard operating systems and mainstream …

Optimized durable commitlog for Apache Cassandra using CAPI-Flash

B Sendir, M Govindaraju, R Odaira… - 2016 IEEE 9th …, 2016 - ieeexplore.ieee.org
High-velocity data imposes high durability overheads on Big Data technology components
such as NoSQL data stores. In Apache Cassandra, a widely used NoSQL solution with high …

Building Resilient Digital Forensic Frameworks for NoSQL Database: Harnessing the Blockchain and Quantum Technology

RU Rahman, K Singh, DS Tomar… - … Security Practices Using …, 2024 - Springer
Digital forensics is the process of gathering, examining, and presenting digital evidence from
devices like computers, smart phones, and cameras with supporting documentation …