A software engineering perspective on engineering machine learning systems: State of the art and challenges

G Giray - Journal of Systems and Software, 2021 - Elsevier
Context: Advancements in machine learning (ML) lead to a shift from the traditional view of
software development, where algorithms are hard-coded by humans, to ML systems …

BehAVExplor: Behavior diversity guided testing for autonomous driving systems

M Cheng, Y Zhou, X Xie - Proceedings of the 32nd ACM SIGSOFT …, 2023 - dl.acm.org
Testing Autonomous Driving Systems (ADSs) is a critical task for ensuring the reliability and
safety of autonomous vehicles. Existing methods mainly focus on searching for safety …

CCTest: Testing and repairing code completion systems

Z Li, C Wang, Z Liu, H Wang, D Chen… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Code completion, a highly valuable topic in the software development domain, has been
increasingly promoted for use by recent advances in large language models (LLMs). To …

Specification-based autonomous driving system testing

Y Zhou, Y Sun, Y Tang, Y Chen, J Sun… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Autonomous vehicle (AV) systems must be comprehensively tested and evaluated before
they can be deployed. High-fidelity simulators such as CARLA or LGSVL allow this to be …

Metamorphic testing of deep learning compilers

D Xiao, Z Liu, Y Yuan, Q Pang, S Wang - Proceedings of the ACM on …, 2022 - dl.acm.org
The prosperous trend of deploying deep neural network (DNN) models to diverse hardware
platforms has boosted the development of deep learning (DL) compilers. DL compilers take …

Perception matters: Detecting perception failures of VQA models using metamorphic testing

Y Yuan, S Wang, M Jiang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Visual question answering (VQA) takes an image and a natural-language question as input
and returns a natural-language answer. To date, VQA models are primarily assessed by …

Unleashing the power of compiler intermediate representation to enhance neural program embeddings

Z Li, P Ma, H Wang, S Wang, Q Tang, S Nie… - Proceedings of the 44th …, 2022 - dl.acm.org
Neural program embeddings have demonstrated considerable promise in a range of
program analysis tasks, including clone identification, program repair, code completion, and …

CC: Causality-aware coverage criterion for deep neural networks

Z Ji, P Ma, Y Yuan, S Wang - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
Deep neural network (DNN) testing approaches have grown fast in recent years to test the
correctness and robustness of DNNs. In particular, DNN coverage criteria are frequently …

Testing your question answering software via asking recursively

S Chen, S Jin, X Xie - 2021 36th IEEE/ACM International …, 2021 - ieeexplore.ieee.org
Question Answering (QA) is an attractive and challenging area in NLP community. There are
diverse algorithms being proposed and various benchmark datasets with different topics and …

Automated testing of image captioning systems

B Yu, Z Zhong, X Qin, J Yao, Y Wang, P He - Proceedings of the 31st …, 2022 - dl.acm.org
Image captioning (IC) systems, which automatically generate a text description of the salient
objects in an image (real or synthetic), have seen great progress over the past few years due …