A systematic literature review on automated log abstraction techniques
Context: Logs are often the first and only information available to software engineers to
understand and debug their systems. Automated log-analysis techniques help software …
Log clustering based problem identification for online service systems
Logs play an important role in the maintenance of large-scale online service systems. When
an online service fails, engineers need to examine recorded logs to gain insights into the …
Logram: Efficient Log Parsing Using n-Gram Dictionaries
Software systems usually record important runtime information in their logs. Logs help
practitioners understand system runtime behaviors and diagnose field failures. As logs are …
Leveraging existing instrumentation to automatically infer invariant-constrained models
Computer systems are often difficult to debug and understand. A common way of gaining
insight into system behavior is to inspect execution logs and documentation. Unfortunately …
An improved KNN-based efficient log anomaly detection method with automatically labeled samples
S Ying, B Wang, L Wang, Q Li, Y Zhao… - ACM Transactions on …, 2021 - dl.acm.org
Logs that record system abnormal states (anomaly logs) can be regarded as outliers, and
the k-Nearest Neighbor (kNN) algorithm has relatively high accuracy in outlier detection …
Debugging distributed systems
Communications of the ACM, August 2016, Vol. 59, No. 8, p. 32. Practice.
DOI:10.1145/2909480. Article development led by queue.acm.org …
Visualizing distributed system executions
Distributed systems pose unique challenges for software developers. Understanding the
system's communication topology and reasoning about concurrent activities of system hosts …
Online anomaly detection in HPC systems
Reliability is a cumbersome problem in High Performance Computing Systems and Data
Centers evolution. During operation, several types of fault conditions or anomalies can arise …
Behavioral resource-aware model inference
T Ohmann, M Herzberg, S Fiss, A Halbert… - Proceedings of the 29th …, 2014 - dl.acm.org
Software bugs often arise because of differences between what developers think their
system does and what the system actually does. These differences frustrate debugging and …
Using declarative specification to improve the understanding, extensibility, and comparison of model-inference algorithms
It is a staple development practice to log system behavior. Numerous powerful model-
inference algorithms have been proposed to aid developers in log analysis and system …