APGNN: Alarm Propagation Graph Neural Network for fault detection and alarm root cause analysis

W Jiang, Y Bai - Computer Networks, 2023 - Elsevier
Telecommunication network plays an important role in our daily life. Fault detection and
alarm root cause analysis are the keys to ensure the normal operation of the network. To …

TraceRank: Abnormal service localization with dis‐aggregated end‐to‐end tracing data in cloud native systems

G Yu, Z Huang, P Chen - Journal of Software: Evolution and …, 2023 - Wiley Online Library
Modern cloud native applications are generally built with a microservice architecture. To
tackle various performance problems among a large number of services and machines, an …

Themis: A passive-active hybrid framework with in-network intelligence for lightweight failure localization

J Xiao, Q Li, D Zhao, X Zuo, W Tang, Y Jiang - Computer Networks, 2024 - Elsevier
The fast and efficient failure detection and localization is essential for stable network
transmission. Unfortunately, existing schemes suffer from a few drawbacks such as …

Link/Switch Failure Analysis of Data Center Networks on Matroidal Connectivity

W Lin, XY Li, JM Chang, X Jia - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
With the surge of bandwidth demand for cloud applications and the exponential growth of
data, data center networks (DCNs) are expanding rapidly, followed by the daily increasing …

Towards automatic root cause diagnosis of persistent packet loss in cloud overlay network

C Fang, H Liu, M Miao, J Ye, L Wang… - IEEE/ACM …, 2022 - ieeexplore.ieee.org
Persistent packet loss in the cloud-scale overlay network severely compromises tenant
experiences. Cloud providers are keen to diagnose such problems efficiently. However …

Themis: A passive-active hybrid framework with in-network intelligence for lightweight failure localization

Q Li, J Xiao, D Zhao, X Zuo, W Tang… - Available at SSRN …, 2023 - papers.ssrn.com
The fast and efficient failure detection and localization is essential for stable network
transmission. Unfortunately, existing schemes suffer from a few drawbacks such as …

ER‐Store: A Hybrid Storage Mechanism with Erasure Coding and Replication in Distributed Database Systems

Z Li, C Xiao - Scientific Programming, 2021 - Wiley Online Library
In distributed database systems, as cluster scales grow, efficiency and availability become
critical considerations. In a cluster, a common approach to high availability is using …

The network link outlier factor (NLOF) for fault localization

C Mendoza, MP McGarry - IEEE Open Journal of the …, 2020 - ieeexplore.ieee.org
We describe and experimentally evaluate the performance of our Network Link Outlier
Factor (NLOF) for locating faults in communication networks. The NLOF is a unique outlier …

Detecting Network Partitioning in Cloud Native 5G Mobile Network Applications

H Bergström, O Fredriksson - 2022 - odr.chalmers.se
With the transition of the 5G core network to a cloud native service-based architecture—
composed of network functions operating through microservices communicating over the …

The Network Link Outlier Factor (NLOF) for Localizing Network Faults

C Mendoza - 2021 - search.proquest.com
This work presents the Network Link Outlier Factor (NLOF), a data analytics pipeline for
network fault detection and localization solution that consists of four stages. In the first stage …