PINT: Probabilistic in-band network telemetry

R Ben Basat, S Ramanathan, Y Li, G Antichi… - Proceedings of the …, 2020 - dl.acm.org
Commodity network devices support adding in-band telemetry measurements into data
packets, enabling a wide range of applications, including network troubleshooting …

CocoSketch: High-performance sketch-based measurement over arbitrary partial key query

Y Zhang, Z Liu, R Wang, T Yang, J Li, R Miao… - Proceedings of the …, 2021 - dl.acm.org
Sketch-based measurement has emerged as a promising alternative to the traditional
sampling-based network measurement approaches due to its high accuracy and resource …

Flow event telemetry on programmable data plane

Y Zhou, C Sun, HH Liu, R Miao, S Bai, B Li… - Proceedings of the …, 2020 - dl.acm.org
Network performance anomalies (NPAs), e.g., long-tailed latency, bandwidth decline, etc., are
increasingly crucial to cloud providers as applications are getting more sensitive to …

OmniMon: Re-architecting network telemetry with resource efficiency and full accuracy

Q Huang, H Sun, PPC Lee, W Bai, F Zhu… - Proceedings of the Annual …, 2020 - dl.acm.org
Network telemetry is essential for administrators to monitor massive data traffic in a
network-wide manner. Existing telemetry solutions often face the dilemma between resource …

When cloud storage meets RDMA

Y Gao, Q Li, L Tang, Y Xi, P Zhang, W Peng… - … USENIX Symposium on …, 2021 - usenix.org
A production-level cloud storage system must be high performing and readily available. It
should also meet a Service-Level Agreement (SLA). The rapid advancement in storage …

Toward Nearly-Zero-Error sketching via compressive sensing

Q Huang, S Sheng, X Chen, Y Bao, R Zhang… - … USENIX Symposium on …, 2021 - usenix.org
Sketch algorithms have been extensively studied in the area of network measurement, given
their limited resource usage and theoretically bounded errors. However, error bounds …

BitSense: Universal and nearly zero-error optimization for sketch counters with compressive sensing

R Ding, S Yang, X Chen, Q Huang - Proceedings of the ACM SIGCOMM …, 2023 - dl.acm.org
Sketch algorithms have been widely deployed for network measurement as they achieve
high accuracy with restricted resource usage. They store measurement results compactly in …

Perseus: A Fail-Slow detection framework for cloud storage systems

R Lu, E Xu, Y Zhang, F Zhu, Z Zhu, M Wang… - … USENIX Conference on …, 2023 - usenix.org
The newly emerging "fail-slow" failures plague both software and hardware where the victim
components are still functioning yet with degraded performance. To address this problem …

SIMON: A simple and scalable method for sensing, inference and measurement in data center networks

Y Geng, S Liu, Z Yin, A Naik, B Prabhakar… - … USENIX Symposium on …, 2019 - usenix.org
Network measurement and monitoring have been key to understanding the inner workings
of computer networks and debugging the performance problems of distributed applications …

Network telemetry: towards a top-down approach

M Yu - ACM SIGCOMM Computer Communication Review, 2019 - dl.acm.org
Network telemetry is about understanding what is happening in the current network. It serves
as the basis for making a variety of management decisions for improving the performance …