An introduction to outlier analysis
CC Aggarwal - 2017 - Springer
Outliers are also referred to as abnormalities, discordants, deviants, or anomalies in the data
mining and statistics literature. In most applications, the data is created by one or more …
A survey on automated log analysis for reliability engineering
Logs are semi-structured text generated by logging statements in software source code. In
recent decades, software logs have become imperative in the reliability assurance …
A survey of online failure prediction methods
F Salfner, M Lenk, M Malek - ACM Computing Surveys (CSUR), 2010 - dl.acm.org
With the ever-growing complexity and dynamicity of computer systems, proactive fault
management is an effective approach to enhancing availability. Online failure prediction is …
Experience report: Deep learning-based system log analysis for anomaly detection
Logs have been an imperative resource to ensure the reliability and continuity of many
software systems, especially large-scale distributed systems. They faithfully record runtime …
What supercomputers say: A study of five system logs
If we hope to automatically detect and diagnose failures in large-scale computer systems,
we must study real deployed systems and the data they generate. Progress has been …
Informed Haar-like features improve pedestrian detection
We propose a simple yet effective detector for pedestrian detection. The basic idea is to
incorporate common sense and everyday knowledge into the design of simple and …
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of
processors. At such large counts of compute nodes, faults are becoming common place …
Failure prediction in IBM BlueGene/L event logs
Frequent failures are becoming a serious concern to the community of high-end computing,
especially when the applications and the underlying systems rapidly grow in size and …
AI for IT operations (AIOps) on cloud platforms: Reviews, opportunities and challenges
Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big
data generated by IT Operations processes, particularly in cloud infrastructures, to provide …
BlueGene/L failure analysis and prediction models
The growing computational and storage needs of several scientific applications mandate the
deployment of extreme-scale parallel machines, such as IBM's BlueGene/L which can …