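A survey of the current state of cloud computing research (云计算研究现状综述)

**Qiao, Zheng Xiao - Computer Science (计算机科学), 2011 - cicpa.org.cn
Abstract: Cloud computing provides users with reliable, customizable services that maximize resource utilization, and is a new distributed computing
paradigm. Combining cloud computing with other technologies and theories is also an important way to solve problems in theoretical research and practical applications …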

Outlier detection for temporal data: A survey

M Gupta, J Gao, CC Aggarwal… - IEEE Transactions on …, 2013 - ieeexplore.ieee.org
In the statistics community, outlier detection for time series data has been studied for
decades. Recently, with advances in hardware and software technology, there has been a …

An analysis of traces from a production mapreduce cluster

S Kavulya, J Tan, R Gandhi… - 2010 10th IEEE/ACM …, 2010 - ieeexplore.ieee.org
MapReduce is a programming paradigm for parallel processing that is increasingly being
used for data-intensive applications in cloud computing environments. An understanding of …

Improving MapReduce performance using smart speculative execution strategy

Q Chen, C Liu, Z Xiao - IEEE Transactions on Computers, 2013 - ieeexplore.ieee.org
MapReduce is a widely used parallel computing framework for large scale data processing.
The two major performance metrics in MapReduce are job execution time and cluster …

Diagnosing performance changes by comparing request flows

RR Sambasivan, AX Zheng, M De Rosa… - … USENIX Symposium on …, 2011 - usenix.org
The causes of performance changes in a distributed system often elude even its developers.
This paper develops a new technique for gaining insight into such changes: comparing …

Localizing faults in cloud systems

L Mariani, C Monni, M Pezzé… - 2018 IEEE 11th …, 2018 - ieeexplore.ieee.org
By leveraging large clusters of commodity hardware, the Cloud offers great opportunities to
optimize the operative costs of software systems, but impacts significantly on the reliability of …

Autonomous anomaly detection on traffic flow time series with reinforcement learning

D He, J Kim, H Shi, B Ruan - Transportation Research Part C: Emerging …, 2023 - Elsevier
This study develops an autonomous artificial intelligence (AI) agent to detect anomalies in
traffic flow time series data, which can learn anomaly patterns from data without supervision …

DTAAD: Dual TCN-attention networks for anomaly detection in multivariate time series data

L Yu, Q Lu, Y Xue - Knowledge-Based Systems, 2024 - Elsevier
Anomaly detection techniques enable effective anomaly detection and diagnosis in
multivariate time series data, which are of major significance for today's industrial applications …

Failure analysis of jobs in compute clouds: A google cluster case study

X Chen, CD Lu, K Pattabiraman - 2014 IEEE 25th International …, 2014 - ieeexplore.ieee.org
In this paper, we analyze a workload trace from the Google cloud cluster and characterize
the observed failures. The goal of our work is to improve the understanding of failures in …

Machine learning job failure analysis and prediction model for the cloud environment

H Bommala, R Aluvalu, S Mudrakola - High-Confidence Computing, 2023 - Elsevier
Reliable and accessible cloud applications are essential for the future of ubiquitous
computing, smart appliances, and electronic health. Owing to the vastness and diversity of …