Anomaly detection and failure root cause analysis in (micro) service-based cloud applications: A survey

J Soldani, A Brogi - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
The proliferation of services and service interactions within microservices and cloud-native
applications makes it harder to detect failures and to identify their possible root causes …

AI for IT operations (AIOps) on cloud platforms: Reviews, opportunities and challenges

Q Cheng, D Sahoo, A Saha, W Yang, C Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big
data generated by IT Operations processes, particularly in cloud infrastructures, to provide …

A causality mining and knowledge graph based method of root cause diagnosis for performance anomaly in cloud applications

J Qiu, Q Du, K Yin, SL Zhang, C Qian - Applied Sciences, 2020 - mdpi.com
With the development of cloud computing technology, the microservice architecture (MSA)
has become a prevailing application architecture in cloud-native applications. Many user …

Failure diagnosis in microservice systems: A comprehensive survey and analysis

S Zhang, S Xia, W Fan, B Shi, X Xiong… - ACM Transactions on …, 2024 - dl.acm.org
Widely adopted for their scalability and flexibility, modern microservice systems present
unique failure diagnosis challenges due to their independent deployment and dynamic …

Detecting anomalies in microservices with execution trace comparison

L Meng, F Ji, Y Sun, T Wang - Future Generation Computer Systems, 2021 - Elsevier
More and more developers and companies have adopted the concept of microservice.
Detecting anomalies and locating root causes are important for improving the reliability of …

Mining root cause knowledge from cloud service incident investigations for AIOps

A Saha, SCH Hoi - Proceedings of the 44th international conference on …, 2022 - dl.acm.org
Root Cause Analysis (RCA) of any service-disrupting incident is one of the most critical as
well as complex tasks in IT processes, especially for cloud industry leaders like Salesforce …

Faster, deeper, easier: crowdsourcing diagnosis of microservice kernel failure from user space

Y Pan, M Ma, X Jiang, P Wang - Proceedings of the 30th ACM SIGSOFT …, 2021 - dl.acm.org
With the widespread use of cloud-native architecture, increasing web applications (apps)
choose to build on microservices. Simultaneously, troubleshooting becomes full of …

An anomaly detection algorithm for microservice architecture based on robust principal component analysis

M Jin, A Lv, Y Zhu, Z Wen, Y Zhong, Z Zhao, J Wu… - IEEE …, 2020 - ieeexplore.ieee.org
Microservice architecture (MSA) is a new software architecture, which divides a large single
application and service into dozens of supporting microservices. With the increasingly …

Look deep into the microservice system anomaly through very sparse logs

X Jiang, Y Pan, M Ma, P Wang - … of the ACM Web Conference 2023, 2023 - dl.acm.org
Intensive monitoring and anomaly diagnosis have become a knotty problem for modern
microservice architecture due to the dynamics of service dependency. While most previous …

A survey on AI for storage

Y Liu, H Wang, K Zhou, CH Li, R Wu - CCF Transactions on High …, 2022 - Springer
Storage, as a core function and fundamental component of computers, provides services for
saving and reading digital data. The increasing complexity of data operations and storage …