Context-free word importance scores for attacking neural networks
N Shakeel, S Shakeel - Journal of Computational and …, 2022 - ojs.bonviewpress.com
Leave-One-Out (LOO) scores provide estimates of feature importance in neural networks, for
adversarial attacks. In this work, we present context-free word scores as a query-efficient …
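As a rough illustration of the leave-one-out idea named in this abstract (a sketch, not code from the paper), the snippet below scores each word by the drop in a classifier's confidence when that word is removed; predict_proba is a hypothetical callable mapping a text string to a scalar model score.

def loo_word_importance(text, predict_proba):
    # predict_proba: hypothetical callable, text -> scalar confidence score.
    words = text.split()
    base = predict_proba(text)  # score on the unmodified input
    scores = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])  # input with word i removed
        scores.append((word, base - predict_proba(reduced)))  # importance = score drop
    return scores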
Robust conversational agents against imperceptible toxicity triggers
Warning: this paper contains content that may be offensive or upsetting. Recent research in
Natural Language Processing (NLP) has advanced the development of various toxicity …
[PDF] I've Seen Things You Machines Wouldn't Believe: Measuring Content Predictability to Identify Automatically-Generated Text.
Modern large language models (LLMs), such as GPT-4 or ChatGPT, are capable of
producing fluent text in natural languages, making their output hard to manually differentiate …
Effective faking of verbal deception detection with target-aligned adversarial attacks
Background: Deception detection through analysing language is a promising avenue using
both human judgments and automated machine learning judgments. For both forms of …
Identifying human strategies for generating word-level adversarial examples
Adversarial examples in NLP are receiving increasing research attention. One line of
investigation is the generation of word-level adversarial examples against fine-tuned …
User-centered security in natural language processing
C Emmery - arXiv preprint arXiv:2301.04230, 2023 - arxiv.org
This dissertation proposes a framework of user-centered security in Natural Language
Processing (NLP), and demonstrates how it can improve the accessibility of related …
Towards stronger adversarial baselines through human-AI collaboration
Natural language processing (NLP) systems are often used for adversarial tasks such as
detecting spam, abuse, hate speech, and fake news. Properly evaluating such systems …
Evaluating Mitigation Approaches for Adversarial Attacks in Crowdwork
CG Harris - 2023 IEEE International Conference on Big Data …, 2023 - ieeexplore.ieee.org
Crowdsourcing has emerged as a collaborative method to accomplish various tasks using
open calls posted on platforms such as Amazon Mechanical Turk. Most requesters who post …
Ethical and Technological AI Risks Classification: A Human Vs Machine Approach
The growing use of data-driven decision systems based on Artificial Intelligence (AI) by
governments, companies and social organizations has given more attention to the …
Understanding and Guarding against Natural Language Adversarial Examples
MAJ Mozes - 2024 - discovery.ucl.ac.uk
Despite their success, machine learning models have been shown to be susceptible to
adversarial examples: carefully constructed perturbations of model inputs that are intended …