A survey on deep learning for software engineering
In 2006, Geoffrey Hinton proposed the concept of training “Deep Neural Networks (DNNs)”
and an improved model training method to break the bottleneck of neural network …
Semi-supervised log-based anomaly detection via probabilistic label estimation
With the growth of software systems, logs have become an important data source to aid system
maintenance. Log-based anomaly detection is one of the most important methods for such …
Recommending root-cause and mitigation steps for cloud incidents using large language models
Incident management for cloud services is a complex process involving several steps and
has a huge impact on both service health and developer productivity. On-call engineers …
Deep learning library testing via effective model generation
Deep learning (DL) techniques are rapidly developed and have been widely adopted in
practice. However, similar to traditional software systems, DL systems also contain bugs …
Prioritizing test inputs for deep neural networks via mutation analysis
Deep Neural Network (DNN) testing is one of the most widely-used ways to guarantee the
quality of DNNs. However, labeling test inputs to check the correctness of DNN prediction is …
Automated root causing of cloud incidents using in-context learning with GPT-4
Root Cause Analysis (RCA) plays a pivotal role in the incident diagnosis process for cloud
services, requiring on-call engineers to identify the primary issues and implement corrective …
Towards intelligent incident management: why we need it and how we make it
The management of cloud service incidents (unplanned interruptions or outages of a
service/product) greatly affects customer satisfaction and business revenue. After years of …
Muffin: Testing deep learning libraries via neural architecture fuzzing
Deep learning (DL) techniques are proven effective in many challenging tasks, and become
widely-adopted in practice. However, previous work has shown that DL libraries, the basis of …
MonitorAssistant: Simplifying cloud service monitoring via large language models
In large-scale cloud service systems, monitoring metric data and conducting anomaly
detection is an important way to maintain reliability and stability. However, great disparity …
Assess and summarize: Improve outage understanding with large language models
Cloud systems have become increasingly popular in recent years due to their flexibility and
scalability. Each time cloud computing applications and services hosted on the cloud are …