Clearing the clouds: a study of emerging scale-out workloads on modern hardware
Emerging scale-out workloads require extensive amounts of computational resources.
However, data centers using modern server hardware face physical constraints in space …
Cache craftiness for fast multicore key-value storage
We present Masstree, a fast key-value database designed for SMP machines. Masstree
keeps all data in memory. Its main data structure is a trie-like concatenation of B+-trees …
Toward dark silicon in servers
Server chips will not scale beyond a few tens to low hundreds of cores, and an increasing
fraction of the chip in future technologies will be dark silicon that we cannot afford to power …
Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs
Join is an important database operation. As computer architectures evolve, the best join
algorithm may change hand. This paper re-examines two popular join algorithms--hash join …
Reactive NUCA: near-optimal block placement and replication in distributed caches
Increases in on-chip communication delay and the large working sets of server and scientific
workloads complicate the design of the on-chip last-level cache for multicore processors …
Relational joins on graphics processors
We present a novel design and implementation of relational join algorithms for new-
generation graphics processing units (GPUs). The most recent GPU features include support …
Spatial memory streaming
Prior research indicates that there is much spatial variation in applications' memory access
patterns. Modern memory systems, however, use small fixed-size cache blocks and as such …
Scale-out processors
Scale-out datacenters mandate high per-server throughput to get the maximum benefit from
the large TCO investment. Emerging applications (e.g., data serving and web search) that run …
Evaluation of hardware data prefetchers on server processors
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …
The impact of memory subsystem resource sharing on datacenter applications
In this paper we study the impact of sharing memory resources on five Google datacenter
applications: a web search engine, Bigtable, content analyzer, image stitching, and protocol …