An improved partitioning mechanism for optimizing massive data analysis using MapReduce

K Slagter, CH Hsu, YC Chung, D Zhang - The Journal of Supercomputing, 2013 - Springer
In the era of Big Data, huge amounts of structured and unstructured data are being produced
daily by a myriad of ubiquitous sources. Big Data is difficult to work with and requires …

An adaptive and memory efficient sampling mechanism for partitioning in MapReduce

K Slagter, CH Hsu, YC Chung - International Journal of Parallel …, 2015 - Springer
Big Data refers to the massive amounts of structured and unstructured data being produced
every day from a wide range of sources. Big Data is difficult to work with and needs a large …

Correlation aware technique for SQL to NoSQL transformation

JC Hsu, CH Hsu, SC Chen… - 2014 7th international …, 2014 - ieeexplore.ieee.org
For better efficiency of parallel and distributed computing, Apache Hadoop distributes the
imported data randomly on data nodes. This mechanism provides some advantages for …

SmartJoin: a network-aware multiway join for MapReduce

K Slagter, CH Hsu, YC Chung, G Yi - Cluster computing, 2014 - Springer
MapReduce is an effective tool for processing large amounts of data in parallel using a
cluster of processors or computers. One common data processing task is the join operation …

Automatic SQL to HQL-NoSQL Querying using PostgreSQL and Integrated Hive-HBase

O Saada, J Daba - WSEAS Trans. Inf. Sci. Appl, 2023 - researchgate.net
The amount of digital data is constantly growing in almost all fields. This data is divided into
two categories, structured and unstructured data. Non-structural databases known as …

SALA: A skew-avoiding and locality-aware algorithm for MapReduce-based join

Z Lin, M Cai, Z Huang, Y Lai - International Conference on Web-Age …, 2015 - Springer
MapReduce is a parallel programming model, which is extensively used to process join
operations for large-scale dataset. However, traditional MapReduce-based join is not …

A SPARQL query processing system using map-phase-multi join for big data in clouds

SW Huang, CH Yu, CK Shieh… - International Journal of …, 2017 - inderscienceonline.com
Big data refers to large datasets which are huge, complex and hard to be stored and
analysed by traditional data processing tools. Linked data is one of the approaches to deal …

A Modified Key Partitioning for BigData Using MapReduce in Hadoop

G Ekambaram, B Palanisamy - Journal of Computer Science, 2015 - Citeseer
In the period of BigData, massive amounts of structured and unstructured data are being
created every day by a multitude of everpresent sources. BigData is complicated to work …

Network-aware multiway join for MapReduce

K Slagter, CH Hsu, YC Chung, JH Park - … , Seoul, Korea, May 9-11, 2013 …, 2013 - Springer
MapReduce is an effective tool for processing large amounts of data in parallel using a
cluster of processors or computers. One common data processing task is the join operation …

A scalable RDF data processing framework based on Pig and Hadoop

Y Tanimura, S Lynden, A Matono, I Kojima - SWJ. Avril, 2013 - Citeseer
In order to effectively handle the growing amount of available RDF data, scalable and
flexible RDF data processing frameworks are needed. While emerging technologies for Big …