Clearing the clouds: a study of emerging scale-out workloads on modern hardware

M Ferdman, A Adileh, O Kocberber, S Volos… - ACM SIGPLAN …, 2012 - dl.acm.org
Emerging scale-out workloads require extensive amounts of computational resources.
However, data centers using modern server hardware face physical constraints in space …

Cache craftiness for fast multicore key-value storage

Y Mao, E Kohler, RT Morris - Proceedings of the 7th ACM European …, 2012 - dl.acm.org
We present Masstree, a fast key-value database designed for SMP machines. Masstree
keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees …

Toward dark silicon in servers

N Hardavellas, M Ferdman, B Falsafi, A Ailamaki - IEEE Micro, 2011 - ieeexplore.ieee.org
Server chips will not scale beyond a few tens to low hundreds of cores, and an increasing
fraction of the chip in future technologies will be dark silicon that we cannot afford to power …

Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs

C Kim, T Kaldewey, VW Lee, E Sedlar… - Proceedings of the …, 2009 - dl.acm.org
Join is an important database operation. As computer architectures evolve, the best join
algorithm may change hand. This paper re-examines two popular join algorithms--hash join …

Reactive NUCA: near-optimal block placement and replication in distributed caches

N Hardavellas, M Ferdman, B Falsafi… - Proceedings of the 36th …, 2009 - dl.acm.org
Increases in on-chip communication delay and the large working sets of server and scientific
workloads complicate the design of the on-chip last-level cache for multicore processors …

Relational joins on graphics processors

B He, K Yang, R Fang, M Lu, N Govindaraju… - Proceedings of the …, 2008 - dl.acm.org
We present a novel design and implementation of relational join algorithms for
new-generation graphics processing units (GPUs). The most recent GPU features include support …

Spatial memory streaming

S Somogyi, TF Wenisch, A Ailamaki, B Falsafi… - ACM SIGARCH …, 2006 - dl.acm.org
Prior research indicates that there is much spatial variation in applications' memory access
patterns. Modern memory systems, however, use small fixed-size cache blocks and as such …

Scale-out processors

P Lotfi-Kamran, B Grot, M Ferdman, S Volos… - ACM SIGARCH …, 2012 - dl.acm.org
Scale-out datacenters mandate high per-server throughput to get the maximum benefit from
the large TCO investment. Emerging applications (e.g., data serving and web search) that run …

Evaluation of hardware data prefetchers on server processors

M Bakhshalipour, S Tabaeiaghdaei… - ACM Computing …, 2019 - dl.acm.org
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …

The impact of memory subsystem resource sharing on datacenter applications

L Tang, J Mars, N Vachharajani, R Hundt… - ACM SIGARCH …, 2011 - dl.acm.org
In this paper we study the impact of sharing memory resources on five Google datacenter
applications: a web search engine, bigtable, content analyzer, image stitching, and protocol …