Glitch tokens in large language models: Categorization taxonomy and effective detection
With the expanding application of Large Language Models (LLMs) in various domains, it
becomes imperative to comprehensively investigate their unforeseen behaviors and …
becomes imperative to comprehensively investigate their unforeseen behaviors and …
Cctest: Testing and repairing code completion systems
Code completion, a highly valuable topic in the software development domain, has been
increasingly promoted for use by recent advances in large language models (LLMs). To …
increasingly promoted for use by recent advances in large language models (LLMs). To …
Improving machine translation systems via isotopic replacement
Machine translation plays an essential role in people's daily international communication.
However, machine translation systems are far from perfect. To tackle this problem …
However, machine translation systems are far from perfect. To tackle this problem …
Natural test generation for precise testing of question answering software
Question answering (QA) software uses information retrieval and natural language
processing techniques to automatically answer questions posed by humans in a natural …
processing techniques to automatically answer questions posed by humans in a natural …
Metamorphic testing of deep learning compilers
The prosperous trend of deploying deep neural network (DNN) models to diverse hardware
platforms has boosted the development of deep learning (DL) compilers. DL compilers take …
platforms has boosted the development of deep learning (DL) compilers. DL compilers take …
Mttm: Metamorphic testing for textual content moderation software
The exponential growth of social media platforms such as Twitter and Facebook has
revolutionized textual communication and textual content publication in human society …
revolutionized textual communication and textual content publication in human society …
A survey on large language models for software engineering
Software Engineering (SE) is the systematic design, development, and maintenance of
software applications, underpinning the digital infrastructure of our modern mainworld. Very …
software applications, underpinning the digital infrastructure of our modern mainworld. Very …
Testing your question answering software via asking recursively
Question Answering (QA) is an attractive and challenging area in NLP community. There are
diverse algorithms being proposed and various benchmark datasets with different topics and …
diverse algorithms being proposed and various benchmark datasets with different topics and …
An image is worth a thousand toxic words: A metamorphic testing framework for content moderation software
The exponential growth of social media platforms has brought about a revolution in
communication and content dissemination in human society. Nevertheless, these platforms …
communication and content dissemination in human society. Nevertheless, these platforms …
Nmtsloth: understanding and testing efficiency degradation of neural machine translation systems
Neural Machine Translation (NMT) systems have received much recent attention due to their
human-level accuracy. While existing works mostly focus on either improving accuracy or …
human-level accuracy. While existing works mostly focus on either improving accuracy or …