Software-defined networking: A comprehensive survey

D Kreutz, FMV Ramos, PE Verissimo… - Proceedings of the …, 2014 - ieeexplore.ieee.org
The Internet has led to the creation of a digital society, where (almost) everything is
connected and is accessible from anywhere. However, despite their widespread adoption …

A survey on observability of distributed edge & container-based microservices

M Usman, S Ferlin, A Brunstrom, J Taheri - IEEE Access, 2022 - ieeexplore.ieee.org
Edge computing is proposed as a technical enabler for meeting emerging network
technologies (such as 5G and Industrial Internet of Things), stringent application …

FIRM: An intelligent fine-grained resource management framework for SLO-oriented microservices

H Qiu, SS Banerjee, S Jha, ZT Kalbarczyk… - 14th USENIX symposium …, 2020 - usenix.org
User-facing latency-sensitive web services include numerous distributed,
intercommunicating microservices that promise to simplify software development and …

Seer: Leveraging big data to navigate the complexity of performance debugging in cloud microservices

Y Gan, Y Zhang, K Hu, D Cheng, Y He… - Proceedings of the …, 2019 - dl.acm.org
Performance unpredictability is a major roadblock towards cloud adoption, and has
performance, cost, and revenue ramifications. Predictable performance is even more critical …

Sage: practical and scalable ML-driven performance debugging in microservices

Y Gan, M Liang, S Dev, D Lo, C Delimitrou - Proceedings of the 26th …, 2021 - dl.acm.org
Cloud applications are increasingly shifting from large monolithic services to complex
graphs of loosely-coupled microservices. Despite the advantages of modularity and …

Maglev: A fast and reliable software network load balancer

DE Eisenbud, C Yi, C Contavalli, C Smith, R Kononov… - NSDI, 2016 - usenix.org
Maglev is Google's network load balancer. It is a large distributed software system that runs
on commodity Linux servers. Unlike traditional hardware network load balancers, it does not …

Packet-level telemetry in large datacenter networks

Y Zhu, N Kang, J Cao, A Greenberg, G Lu… - Proceedings of the …, 2015 - dl.acm.org
Debugging faults in complex networks often requires capturing and analyzing traffic at the
packet level. In this task, datacenter networks (DCNs) present unique challenges with their …

MicroRCA: Root cause localization of performance issues in microservices

L Wu, J Tordsson, E Elmroth… - NOMS 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org
Software architecture is undergoing a transition from monolithic architectures to
microservices to achieve resilience, agility and scalability in software development …

Pivot tracing: Dynamic causal monitoring for distributed systems

J Mace, R Roelke, R Fonseca - ACM Transactions on Computer Systems …, 2018 - dl.acm.org
Monitoring and troubleshooting distributed systems is notoriously difficult; potential problems
are complex, varied, and unpredictable. The monitoring and diagnosis tools commonly used …

The datacenter as a computer: An introduction to the design of warehouse-scale machines

LA Barroso, J Clidaras - 2022 - books.google.com
As computation continues to move into the cloud, the computing platform of interest no
longer resembles a pizza box or a refrigerator, but a warehouse full of computers. These …