Using machine learning to optimize parallelism in big data applications
In-memory cluster computing platforms have gained momentum in the last years, due to their
ability to analyse big amounts of data in parallel. These platforms are complex and difficult-to …
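To make the tuning idea concrete, here is a minimal sketch (not the paper's method) that fits measured run times against the degree of parallelism and picks the setting the fitted model predicts to be fastest; the sample measurements and the quadratic model are assumptions for illustration.

```python
# Hypothetical measurements of job run time at different parallelism levels;
# a quadratic fit stands in for whatever learned model the paper actually uses.
import numpy as np

partitions = np.array([8, 16, 32, 64, 128, 256])      # tried parallelism levels
runtime_s  = np.array([310, 180, 120, 95, 98, 130])   # measured run times (s)

coeffs = np.polyfit(partitions, runtime_s, deg=2)      # fit runtime ~ partitions

candidates = np.arange(8, 257)
predicted = np.polyval(coeffs, candidates)
best = candidates[np.argmin(predicted)]                # parallelism the model prefers
print(f"model suggests ~{best} partitions")
```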
High-performance design of Apache Spark with RDMA and its benefits on various workloads
The in-memory data processing framework, Apache Spark, has been stealing the limelight
for low-latency interactive applications, iterative and batch computations. Our early …
Kosmo: Efficient online miss ratio curve generation for eviction policy evaluation
In-memory caches play an important role in reducing the load on backend storage servers
for many workloads. Miss ratio curves (MRCs) are an important tool for configuring these …
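For context on what an MRC is, the sketch below computes an LRU miss ratio curve with the classic stack-distance approach over a hypothetical trace; Kosmo's contribution is generating such curves online for arbitrary eviction policies, which this sketch does not attempt.

```python
# Minimal LRU miss-ratio curve via stack (reuse) distances over a toy trace.
from collections import OrderedDict

def lru_mrc(trace, max_size):
    stack = OrderedDict()          # keys ordered from least to most recent
    hist = [0] * (max_size + 1)    # hist[d] = reuses observed at stack distance d
    cold = 0
    for key in trace:
        if key in stack:
            # distance = number of distinct keys touched more recently than `key`
            dist = len(stack) - list(stack).index(key) - 1   # O(n); fine for a sketch
            hist[min(dist, max_size)] += 1
            del stack[key]
        else:
            cold += 1                                        # compulsory miss
        stack[key] = True                                    # move/insert as most recent
    total = len(trace)
    # miss ratio at cache size c = cold misses plus reuses with distance >= c
    return [(cold + sum(hist[c:])) / total for c in range(1, max_size + 1)]

trace = ["a", "b", "c", "a", "b", "d", "a", "c", "b", "d"]
print(lru_mrc(trace, max_size=4))
```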
ExtMem: Enabling Application-Aware Virtual Memory Management for Data-Intensive Applications
S Jalalian, S Patel, MR Hajidehi, M Seltzer… - 2024 USENIX annual …, 2024 - usenix.org
For over forty years, researchers have demonstrated that operating system memory
managers often fall short in supporting memory-hungry applications. The problem is even …
LRC: Dependency-aware cache management for data analytics clusters
Memory caches are being aggressively used in today's data-parallel systems such as Spark,
Tez, and Piccolo. However, prevalent systems employ rather simple cache management …
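A minimal sketch of the least-reference-count idea behind dependency-aware caching, with reference counts supplied by hand rather than derived from the application DAG as a real system would:

```python
# Evict the cached block that the remaining job will reference the fewest times.
class LRCCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}       # block_id -> data
        self.ref_count = {}    # block_id -> remaining downstream references

    def put(self, block_id, data, refs):
        if len(self.blocks) >= self.capacity:
            # victim = block with the lowest remaining reference count
            victim = min(self.blocks, key=lambda b: self.ref_count[b])
            del self.blocks[victim]
            del self.ref_count[victim]
        self.blocks[block_id] = data
        self.ref_count[block_id] = refs

    def get(self, block_id):
        data = self.blocks.get(block_id)
        if data is not None and self.ref_count[block_id] > 0:
            self.ref_count[block_id] -= 1   # one downstream consumer satisfied
        return data

cache = LRCCache(capacity=2)
cache.put("rdd_1_part_0", b"...", refs=3)
cache.put("rdd_2_part_0", b"...", refs=1)
cache.put("rdd_3_part_0", b"...", refs=2)   # evicts rdd_2_part_0 (lowest refs)
print(sorted(cache.blocks))
```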
Agile-Ant: Self-managing Distributed Cache Management for Cost Optimization of Big Data Applications
H Al-Sayeh, MA Jibril, KU Sattler - Proceedings of the VLDB Endowment, 2024 - dl.acm.org
Distributed in-memory processing frameworks accelerate application runs by caching
important datasets in memory. Allocating a suitable cluster configuration for caching these …
Improving Spark application throughput via memory aware task co-location: A mixture of experts approach
Data analytic applications built upon big data processing frameworks such as Apache Spark
are an important class of applications. Many of these applications are not latency-sensitive …
Dynamic memory-aware scheduling in Spark computing environment
Scheduling plays an important role in improving the performance of big data-parallel
processing. Spark is an in-memory parallel computing framework that uses a multi-threaded …
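A hedged sketch of what memory-aware dispatch can look like (not the paper's algorithm): a task is launched only if its estimated memory footprint fits the chosen executor's free memory, otherwise it is deferred.

```python
from dataclasses import dataclass, field

@dataclass
class Executor:
    name: str
    free_mem_mb: int
    running: list = field(default_factory=list)

def schedule(tasks, executors):
    """tasks: list of (task_id, est_mem_mb); returns task ids deferred for lack of memory."""
    deferred = []
    for task_id, est_mem in tasks:
        # worst-fit: pick the executor with the most free memory
        target = max(executors, key=lambda e: e.free_mem_mb)
        if target.free_mem_mb >= est_mem:
            target.free_mem_mb -= est_mem
            target.running.append(task_id)
        else:
            deferred.append(task_id)   # wait until memory is released
    return deferred

execs = [Executor("exec-1", 4096), Executor("exec-2", 2048)]
tasks = [("t1", 1500), ("t2", 3000), ("t3", 2500), ("t4", 2000)]
print(schedule(tasks, execs), [(e.name, e.running) for e in execs])
```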
Reference-distance eviction and prefetching for cache management in Spark
Optimizing memory cache usage is vital for performance of in-memory data-parallel
frameworks such as Spark. Current data-analytic frameworks utilize the popular Least …
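A minimal sketch of the reference-distance intuition, assuming the future block reference order is known from the DAG: evict the cached block whose next reference is farthest away and prefetch the block needed soonest. The schedule below is hypothetical and this is not the paper's exact policy.

```python
def next_reference(block, schedule, now):
    """Index of block's next use after `now`, or infinity if it is never used again."""
    for i in range(now + 1, len(schedule)):
        if schedule[i] == block:
            return i
    return float("inf")

def pick_victim(cached, schedule, now):
    # Belady-style: evict the block referenced farthest in the future
    return max(cached, key=lambda b: next_reference(b, schedule, now))

def pick_prefetch(cached, schedule, now):
    # prefetch the first upcoming block that is not cached yet
    upcoming = [b for b in schedule[now + 1:] if b not in cached]
    return upcoming[0] if upcoming else None

schedule = ["A", "B", "C", "A", "D", "B", "A"]   # block references known from the DAG
cached = {"A", "B", "C"}
now = 2                                          # just finished referencing "C"
print(pick_victim(cached, schedule, now))        # "C": never referenced again
print(pick_prefetch(cached, schedule, now))      # "D": needed soonest, not cached
```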
Intermediate data caching optimization for multi-stage and parallel big data frameworks
Z Yang, D Jia, S Ioannidis, N Mi… - 2018 IEEE 11th …, 2018 - ieeexplore.ieee.org
In the era of big data and cloud computing, large amounts of data are generated from user
applications and need to be processed in the datacenter. Data-parallel computing …
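As a concrete example of the manual caching decision such works aim to automate, the snippet below persists a reused intermediate DataFrame in PySpark (assumes a local pyspark installation; names and filters are illustrative).

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

spark = SparkSession.builder.master("local[*]").appName("cache-demo").getOrCreate()

logs = spark.range(1_000_000).withColumnRenamed("id", "user_id")

# This intermediate result feeds two downstream actions, so cache it once
# instead of recomputing it for each one.
filtered = logs.filter("user_id % 7 = 0").persist(StorageLevel.MEMORY_ONLY)

count_a = filtered.filter("user_id % 2 = 0").count()   # first downstream use
count_b = filtered.filter("user_id % 3 = 0").count()   # second downstream use

filtered.unpersist()   # release executor memory once both uses are done
spark.stop()
print(count_a, count_b)
```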