[BOOK] Memory systems: cache, DRAM, disk
B Jacob, D Wang, S Ng - 2010 - books.google.com
Is your memory hierarchy stopping your microprocessor from performing at the high level it
should be? Memory Systems: Cache, DRAM, Disk shows you how to resolve this problem …
A Comprehensive Survey of Benchmarks for Improvement of Software's Non-Functional Properties
Despite recent increase in research on improvement of non-functional properties of
software, such as energy usage or program size, there is a lack of standard benchmarks for …
I-spy: Context-driven conditional instruction prefetching with coalescing
Modern data center applications have rapidly expanding instruction footprints that lead to
frequent instruction cache misses, increasing cost and degrading data center performance …
Proactive instruction fetch
Fast access requirements preclude building L1 instruction caches large enough to capture
the working set of server workloads. Efforts exist to mitigate limited L1 instruction cache …
Twig: Profile-guided BTB prefetching for data center applications
Modern data center applications have deep software stacks, with instruction footprints that
are orders of magnitude larger than typical instruction cache (I-cache) sizes. To efficiently …
Propeller: A profile guided, relinking optimizer for warehouse-scale applications
While profile guided optimizations (PGO) and link time optimizations (LTO) have been
widely adopted, post link optimizations (PLO) have languished until recently when …
Boomerang: A metadata-free architecture for control flow delivery
Contemporary server workloads feature massive instruction footprints stemming from deep,
layered software stacks. The active instruction working set of the entire stack can easily …
Thermometer: Profile-guided BTB replacement for data center applications
Modern processors employ a decoupled frontend with Fetch Directed Instruction Prefetching
(FDIP) to avoid frontend stalls in data center applications. However, the large branch …
Temporal instruction fetch streaming
L1 instruction-cache misses pose a critical performance bottleneck in commercial server
workloads. Cache access latency constraints preclude L1 instruction caches large enough …
RDIP: Return-address-stack directed instruction prefetching
L1 instruction fetch misses remain a critical performance bottleneck, accounting for up to
40% slowdowns in server applications. Whereas instruction footprints typically fit within last …