- Academic Search

Survey of scheduling techniques for addressing shared resources in multicore processors

S Zhuravlev, JC Saez, S Blagodurov… - ACM Computing …, 2012 - dl.acm.org

Chip multicore processors (CMPs) have emerged as the dominant architecture choice for
modern computing platforms and will most likely continue to be dominant well into the …

Cited by 234; all 9 versions

[PDF] usenix.org

Fairness in serving large language models

Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo… - … USENIX Symposium on …, 2024 - usenix.org

High-demand LLM inference services (e.g., ChatGPT and BARD) support a wide range of
requests from short chat conversations to long document reading. To ensure that all client …

Cited by 39; all 3 versions

[PDF] acm.org

Heracles: Improving resource efficiency at scale

D Lo, L Cheng, R Govindaraju… - Proceedings of the …, 2015 - dl.acm.org

User-facing, latency-sensitive services, such as websearch, underutilize their computing
resources during daily periods of low traffic. Reusing those resources for other tasks is rarely …

Cited by 696; all 21 versions

[PDF] umich.edu

Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations

J Mars, L Tang, R Hundt, K Skadron… - Proceedings of the 44th …, 2011 - dl.acm.org

As much of the world's computing continues to move into the cloud, the overprovisioning of
computing resources to ensure the performance isolation of latency-sensitive tasks, such as …

Cited by 792; all 20 versions

[PDF] cmu.edu

A case for exploiting subarray-level parallelism (SALP) in DRAM

Y Kim, V Seshadri, D Lee, J Liu, O Mutlu - ACM SIGARCH Computer …, 2012 - dl.acm.org

Modern DRAMs have multiple banks to serve multiple memory requests in parallel.
However, when two requests go to the same bank, they have to be served serially …

Cited by 486; all 22 versions

[PDF] psu.edu

Memguard: Memory bandwidth reservation system for efficient performance isolation in multi-core platforms

H Yun, G Yao, R Pellizzoni… - 2013 IEEE 19th Real …, 2013 - ieeexplore.ieee.org

Memory bandwidth in modern multi-core platforms is highly variable for many reasons and is
a big challenge in designing real-time systems as applications are increasingly becoming …

Cited by 441; all 10 versions

[PDF] cmu.edu

Thread cluster memory scheduling: Exploiting differences in memory access behavior

Y Kim, M Papamichael, O Mutlu… - 2010 43rd Annual …, 2010 - ieeexplore.ieee.org

In a modern chip-multiprocessor system, memory is a shared resource among multiple
concurrently executing threads. The memory scheduling algorithm should resolve memory …

Cited by 572; all 28 versions

[PDF] cam.ac.uk

Self-optimizing memory controllers: A reinforcement learning approach

E Ipek, O Mutlu, JF Martínez, R Caruana - ACM SIGARCH Computer …, 2008 - dl.acm.org

Efficiently utilizing off-chip DRAM bandwidth is a critical issue in designing cost-effective,
high-performance chip multiprocessors (CMPs). Conventional memory controllers deliver …

Cited by 656; all 20 versions

[PDF] psu.edu

Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems

O Mutlu, T Moscibroda - ACM SIGARCH Computer Architecture News, 2008 - dl.acm.org

In a chip-multiprocessor (CMP) system, the DRAM system is shared among cores. In a
shared DRAM system, requests from a thread can not only delay requests from other threads …

Cited by 762; all 10 versions

[PDF] cmu.edu

ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers

Y Kim, D Han, O Mutlu… - HPCA-16 2010 The …, 2010 - ieeexplore.ieee.org

Modern chip multiprocessor (CMP) systems employ multiple memory controllers to control
access to main memory. The scheduling algorithm employed by these memory controllers …

Cited by 581; all 24 versions

Saved in "My Library"

Fair queuing memory systems

Survey of scheduling techniques for addressing shared resources in multicore processors