Google Scholar

Pbse: A robust path-based speculative execution for degraded-network tail tolerance in data-paral...


Cluster frameworks for efficient scheduling and resource allocation in data center networks: A survey

K Wang, Q Zhou, S Guo, J Luo - IEEE Communications Surveys …, 2018 - ieeexplore.ieee.org

Data centers are widely used for big data analytics, which often involve data-parallel jobs,
including query and web service. Meanwhile, cluster frameworks are rapidly developed for …

Cited by 73 · All 5 versions


[PDF] usenix.org

The CASE of FEMU: Cheap, accurate, scalable and extensible flash emulator

H Li, M Hao, MH Tong, S Sundararaman… - … USENIX Conference on …, 2018 - usenix.org

We present FEMU, a QEMU-based flash emulator for fostering future full-stack
software/hardware SSD research, with the following four "CASE" benefits. FEMU is cheap …

Cited by 179 · All 13 versions


[PDF] acm.org

Fail-slow at scale: Evidence of hardware performance faults in large production systems

HS Gunawi, RO Suminto, R Sears, C Golliher… - ACM Transactions on …, 2018 - dl.acm.org

Fail-slow hardware is an under-studied failure mode. We present a study of 114 reports of
fail-slow hardware incidents, collected from large-scale cluster deployments in 14 …

Cited by 178 · All 15 versions

A2tp: Aggregator-aware in-network aggregation for multi-tenant learning

Z Li, J Huang, Y Li, A Xu, S Zhou, J Liu… - Proceedings of the …, 2023 - dl.acm.org

Distributed Machine Learning (DML) techniques are widely used to accelerate the training of
large-scale machine learning models. However, during training iterations, gradients need to …

Cited by 23


[PDF] usenix.org

Perseus: A Fail-Slow detection framework for cloud storage systems

R Lu, E Xu, Y Zhang, F Zhu, Z Zhu, M Wang… - … USENIX Conference on …, 2023 - usenix.org

The newly-emerging "fail-slow" failures plague both software and hardware where the victim
components are still functioning yet with degraded performance. To address this problem …

Cited by 25 · All 9 versions


[PDF] acm.org

MittOS: Supporting millisecond tail tolerance with fast rejecting SLO-aware OS interface

M Hao, H Li, MH Tong, C Pakha, RO Suminto… - Proceedings of the 26th …, 2017 - dl.acm.org

MittOS provides operating system support to cut millisecond-level tail latencies for
data-parallel applications. In MittOS, we advocate a new principle that operating system should …

Cited by 104 · All 12 versions


[PDF] acm.org

Managing tail latency in datacenter-scale file systems under production constraints

PA Misra, MF Borge, Í Goiri, AR Lebeck… - Proceedings of the …, 2019 - dl.acm.org

Distributed file systems often exhibit high tail latencies, especially in large-scale datacenters
and in the presence of competing (and possibly higher priority) workloads. This paper …

Cited by 59 · All 5 versions


[PDF] usenix.org

IASO: A Fail-Slow Detection and Mitigation Framework for Distributed Storage Services

B Panda, D Srinivasan, H Ke, K Gupta, V Khot… - 2019 USENIX Annual …, 2019 - usenix.org

We address the problem of “fail-slow” fault, a fault where a hardware or software component
can still function (does not fail-stop) but in much lower performance than expected. To …

Cited by 44 · All 10 versions


[PDF] acm.org

Reducing tail latency using duplication: A multi-layered approach

HM Bashir, AB Faisal, MA Jamshed… - Proceedings of the 15th …, 2019 - dl.acm.org

Duplication can be a powerful strategy for overcoming stragglers in cloud services, but is
often used conservatively because of the risk of overloading the system. We call for making …

Cited by 15 · All 2 versions


[PDF] uchicago.edu

[PDF] The University of Chicago

W KIM - 2019 - newtraell.cs.uchicago.edu

ABSTRACT In the Node-Disjoint Paths problem (NDP), the input is an undirected n-vertex
graph G, and a collection {(s1, t1),...,(sk, tk)} of demand pairs. The goal is to route the largest …

Cited by 24 · All 2 versions
