Toward the observability of cloud-native applications: The overview of the state-of-the-art

J Kosińska, B Baliś, M Konieczny, M Malawski… - IEEE …, 2023 - ieeexplore.ieee.org
The Cloud-native model, established to enhance the Twelve-Factor patterns, is an approach
to developing and deploying applications according to DevOps concepts, Continuous …

Autonomous monitors for detecting failures early and reporting interpretable alerts in cloud operations

A Hrusto, P Runeson, MC Ohlsson - Proceedings of the 46th …, 2024 - dl.acm.org
Detecting failures early in cloud-based software systems is highly significant as it can reduce
operational costs, enhance service reliability, and improve user experience. Many existing …

Adaptive feature selection for predicting application performance degradation in edge cloud environments

B Shayesteh, C Fu, A Ebrahimzadeh… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Applications deployed in edge cloud environments can have stringent requirements such as
high throughput and high availability. However, these applications may suffer from …

Canary: fault-tolerant FaaS for stateful time-sensitive applications

M Arif, K Assogba, MM Rafique - … : International Conference for …, 2022 - ieeexplore.ieee.org
Function-as-a-Service (FaaS) platforms have recently gained rapid popularity. Many stateful
applications have been migrated to FaaS platforms due to their ease of deployment …

Causal-temporal analysis-based feature selection for predicting application performance degradation in edge clouds

B Shayesteh, C Fu, A Ebrahimzadeh… - ICC 2023-IEEE …, 2023 - ieeexplore.ieee.org
Next-generation networks will enable applications that are expected to be highly reliable,
always available, with guaranteed Quality-of-Service (QoS). Distributed, heterogeneous …

Container cascade fault detection based on spatial–temporal correlation in cloud environment

N Chen, Q Zhong, Y Liu, W Liu, L Bai, L Hu - Journal of Cloud Computing, 2023 - Springer
Containers are light, numerous, and interdependent, which are prone to cascading fault,
increasing the probability of fault and the difficulty of detection. Existing detection methods …

Advancing Software Monitoring: An Industry Survey on ML-Driven Alert Management Strategies

A Hrusto, P Runeson, E Engström… - 2024 50th Euromicro …, 2024 - ieeexplore.ieee.org
With the dynamic nature of modern software development and operations environments and
the increasing complexity of cloud-based software systems, traditional monitoring practices …

Smart Quality Monitoring for Evolving Complex Systems

N El Moussa - Proceedings of the 2024 IEEE/ACM 46th International …, 2024 - dl.acm.org
Evolving complex systems, such as complex software systems, dynamic cloud systems and
smart ecosystems, arise from the interactions of systems, agents and people, evolve and …

[BOOK][B] An Intelligent Framework for Efficiently Utilizing Distributed Heterogeneous Resources to Improve HPC Application Performance

M Arif - 2024 - search.proquest.com
Abstract High-Performance Computing (HPC) workloads are being widely used to solve
complex problems in scientific applications from diverse domains, such as weather …

Implementation of High-Performance Automated Monitoring Collection Based on Kubernetes

K Li, X Xiao, C Gao, S Yu, X Tang… - 2024 3rd International …, 2024 - ieeexplore.ieee.org
Unified configuration and centralized management of monitoring collection are prerequisites
and key elements for achieving automated monitoring in cloud-native platforms. With the …