Survey of vulnerabilities in large language models revealed by adversarial attacks
Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as
they integrate more deeply into complex systems, the urgency to scrutinize their security …
Explainable AI: A review of machine learning interpretability methods
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …
"Do Anything Now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models
The misuse of large language models (LLMs) has drawn significant attention from the
general public and LLM vendors. One particular type of adversarial prompt, known as …
DecodingTrust: A comprehensive assessment of trustworthiness in GPT models
Abstract Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
The rise in malicious usage of large language models, such as fake content creation and
academic plagiarism, has motivated the development of approaches that identify AI …
HarmBench: A standardized evaluation framework for automated red teaming and robust refusal
Automated red teaming holds substantial promise for uncovering and mitigating the risks
associated with the malicious use of large language models (LLMs), yet the field lacks a …
A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
In ChatGPT we trust? Measuring and characterizing the reliability of ChatGPT
The way users acquire information is undergoing a paradigm shift with the advent of
ChatGPT. Unlike conventional search engines, ChatGPT retrieves knowledge from the …
A survey of safety and trustworthiness of large language models through the lens of verification and validation
Large language models (LLMs) have exploded a new heatwave of AI for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …
A survey of adversarial defenses and robustness in NLP
In the past few years, it has become increasingly evident that deep neural networks are not
resilient enough to withstand adversarial perturbations in input data, leaving them …