A software engineering perspective on engineering machine learning systems: State of the art and challenges
G Giray - Journal of Systems and Software, 2021 - Elsevier
Context: Advancements in machine learning (ML) lead to a shift from the traditional view of
software development, where algorithms are hard-coded by humans, to ML systems …
Calibration and correctness of language models for code
Machine learning models are widely used, but can also often be wrong. Users would benefit
from a reliable indication of whether a given output from a given model should be trusted, so …
Prioritizing test inputs for deep neural networks via mutation analysis
Deep Neural Network (DNN) testing is one of the most widely-used ways to guarantee the
quality of DNNs. However, labeling test inputs to check the correctness of DNN prediction is …
Are machine learning cloud APIs used correctly?
Machine learning (ML) cloud APIs enable developers to easily incorporate learning
solutions into software systems. Unfortunately, ML APIs are challenging to use correctly and …
Automated testing of software that uses machine learning APIs
An increasing number of software applications incorporate machine learning (ML) solutions
for cognitive tasks that statistically mimic human behaviors. To test such software …
A review and refinement of surprise adequacy
Surprise Adequacy (SA) is one of the emerging and most promising adequacy criteria for
Deep Learning (DL) testing. As an adequacy criterion, it has been used to assess the …
Quality and Trust in LLM-generated Code
Machine learning models are widely used but can also often be wrong. Users
would benefit from a reliable indication of whether a given output from a given model should …
Can Coverage Criteria Guide Failure Discovery for Image Classifiers? An Empirical Study
Quality assurance of deep neural networks (DNNs) is crucial for the deployment of DNN-based
software, especially in mission- and safety-critical tasks. Inspired by structural white …
Keeper: Automated Testing and Fixing of Machine Learning Software
The increasing number of software applications incorporating machine learning (ML)
solutions has led to the need for testing techniques. However, testing ML software requires …
Resource‐adaptive and OOD‐robust inference of deep neural networks on IoT devices
Efficiently executing inference tasks of deep neural networks on devices with limited
resources poses a significant load in IoT systems. To alleviate the load, one innovative …