Deep learning for anomaly detection in log data: A survey

M Landauer, S Onder, F Skopik… - Machine Learning with …, 2023 - Elsevier
Automatic log file analysis enables early detection of relevant incidents such as system
failures. In particular, self-learning anomaly detection techniques capture patterns in log …

Performance anomaly detection and bottleneck identification

O Ibidunmoye, F Hernández-Rodriguez… - ACM Computing Surveys …, 2015 - dl.acm.org
In order to meet stringent performance requirements, system administrators must effectively
detect undesirable performance behaviours, identify potential root causes, and take …

Log-based anomaly detection without log parsing

VH Le, H Zhang - … 36th IEEE/ACM International Conference on …, 2021 - ieeexplore.ieee.org
Software systems often record important runtime information in system logs for
troubleshooting purposes. There have been many studies that use log data to construct …

Semi-supervised log-based anomaly detection via probabilistic label estimation

L Yang, J Chen, Z Wang, W Wang… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
With the growth of software systems, logs have become an important data to aid system
maintenance. Log-based anomaly detection is one of the most important methods for such …

Robust log-based anomaly detection on unstable log data

X Zhang, Y Xu, Q Lin, B Qiao, H Zhang… - Proceedings of the …, 2019 - dl.acm.org
Logs are widely used by large and complex software-intensive systems for troubleshooting.
There have been a lot of studies on log-based anomaly detection. To detect the anomalies …

Experience report: System log analysis for anomaly detection

S He, J Zhu, P He, MR Lyu - 2016 IEEE 27th international …, 2016 - ieeexplore.ieee.org
Anomaly detection plays an important role in management of modern large-scale distributed
systems. Logs, which record system runtime information, are widely used for anomaly …

A survey of AIOps methods for failure management

P Notaro, J Cardoso, M Gerndt - ACM Transactions on Intelligent …, 2021 - dl.acm.org
Modern society is increasingly moving toward complex and distributed computing systems.
The increase in scale and complexity of these systems challenges O&M teams that perform …

Detecting large-scale system problems by mining console logs

W Xu, L Huang, A Fox, D Patterson… - Proceedings of the ACM …, 2009 - dl.acm.org
Surprisingly, console logs rarely help operators detect problems in large-scale datacenter
services, for they often consist of the voluminous intermixing of messages from many …

Microscope: Pinpoint performance issues with causal graphs in micro-service environments

JJ Lin, P Chen, Z Zheng - … , ICSOC 2018, Hangzhou, China, November 12 …, 2018 - Springer
Driven by the emerging business models (e.g., digital sales) and IT technologies (e.g., DevOps
and Cloud computing), the architecture of software is shifting from monolithic to microservice …

Counterfactual explanations for multivariate time series

E Ates, B Aksar, VJ Leung… - … conference on applied …, 2021 - ieeexplore.ieee.org
Multivariate time series are used in many science and engineering domains, including
health-care, astronomy, and high-performance computing. A recent trend is to use machine …