A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench
Big Data analytics for storing, processing, and analyzing large-scale datasets has become
an essential tool for the industry. The advent of distributed computing frameworks such as …
A parallel grid optimization of SVM hyperparameter for big data classification using spark Radoop
The big data phenomenon is currently a challenge to the process of relevant knowledge
extraction using classical machine learning technique. This is due to the need for efficient …
Compiling data-parallel datalog
Datalog allows intuitive declarative specification of logical inference tasks while enjoying
efficient implementation via state-of-the-art engines such as LogicBlox and Soufflé. These …
The Parallel Fuzzy C-Median Clustering Algorithm Using Spark for the Big Data
MA Mallik, NF Zulkurnain, S Siddiqui, R Sarkar - IEEE Access, 2024 - ieeexplore.ieee.org
Big data for sustainable development is a global issue due to the explosive growth of data
and according to the forecasting of International Data Corporation (IDC), the amount of data …
An Earlier Experiences Towards Optimizing Apache Spark Over Frontera Supercomputer
Apache Spark has become a very popular computing engine that allows distributing
computing tasks on a compute cluster. However, the current approaches lack necessary …
Containerization vs Bare Metal: distributed computing performance using Apache Spark
ΜΕ Τσαρμποπούλου - 2024 - dspace.lib.ntua.gr
This research explores the performance trade-offs between containerized and bare metal
environments for running Apache Spark applications, specifically focusing on incident …
Big Data and machine learning to improve medical monitoring and remote monitoring
AKG Escamilla - 2020 - theses.hal.science
In order to improve heart diseases care, and heart failure disease more specifically,
particularly for patients with NYHA (New-York Heart Association) stage III/IV, the most …