Cluster frameworks for efficient scheduling and resource allocation in data center networks: A survey

K Wang, Q Zhou, S Guo, J Luo - IEEE Communications Surveys …, 2018 - ieeexplore.ieee.org
Data centers are widely used for big data analytics, which often involve data-parallel jobs,
including query and web service. Meanwhile, cluster frameworks are rapidly developed for …

Decentralized task-aware scheduling for data center networks

FR Dogar, T Karagiannis, H Ballani… - ACM SIGCOMM …, 2014 - dl.acm.org
Many data center applications perform rich and complex tasks (e.g., executing a search query
or generating a user's news-feed). From a network perspective, these tasks typically …

Large-scale spatial join query processing in cloud

S You, J Zhang, L Gruenwald - 2015 31st IEEE international …, 2015 - ieeexplore.ieee.org
The rapidly increasing amount of location data available in many applications has made it
desirable to process their large-scale spatial queries in Cloud for performance and …

Cloud performance modeling with benchmark evaluation of elastic scaling strategies

K Hwang, X Bai, Y Shi, M Li… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
In this paper, we present generic cloud performance models for evaluating IaaS, PaaS,
SaaS, and mashup or hybrid clouds. We test clouds with real-life benchmark programs and …

Don't Get Caught in the Cold, Warm-up Your JVM: Understand and Eliminate JVM Warm-up Overhead in Data-Parallel Systems

D Lion, A Chiu, H Sun, X Zhuang, N Grcevski… - … USENIX Symposium on …, 2016 - usenix.org
Many widely used, latency sensitive, data-parallel distributed systems, such as HDFS, Hive,
and Spark choose to use the Java Virtual Machine (JVM), despite debate on the overhead of …

Improving performance of heterogeneous MapReduce clusters with adaptive task tuning

D Cheng, J Rao, Y Guo, C Jiang… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
Datacenter-scale clusters are evolving toward heterogeneous hardware architectures due to
continuous server replacement. Meanwhile, datacenters are commonly shared by many …

BriskStream: Scaling data stream processing on shared-memory multicore architectures

S Zhang, J He, AC Zhou, B He - … of the 2019 international conference on …, 2019 - dl.acm.org
We introduce BriskStream, an in-memory data stream processing system (DSPS)
specifically designed for modern shared-memory multicore architectures. BriskStream's key …

NetAgg: Using middleboxes for application-specific on-path aggregation in data centres

L Mai, L Rupprecht, A Alim, P Costa… - Proceedings of the 10th …, 2014 - dl.acm.org
Data centre applications for batch processing (e.g., map/reduce frameworks) and online
services (e.g., search engines) scale by distributing data and computation across many …

Energy efficiency aware task assignment with DVFS in heterogeneous Hadoop clusters

D Cheng, X Zhou, P Lama, M Ji… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
While Hadoop ecosystems become increasingly important for practitioners of large-scale
data analysis, they also incur tremendous energy cost. This trend is driving up the need for …

Text cube: Computing IR measures for multidimensional text database analysis

CX Lin, B Ding, J Han, F Zhu… - 2008 Eighth IEEE …, 2008 - ieeexplore.ieee.org
Since Jim Gray introduced the concept of "data cube" in 1997, data cube,
associated with online analytical processing (OLAP), has become a driving engine in data …