Context-free word importance scores for attacking neural networks
N Shakeel, S Shakeel - Journal of Computational and …, 2022 - ojs.bonviewpress.com
Leave-One-Out (LOO) scores provide estimates of feature importance in neural networks, for
adversarial attacks. In this work, we present context-free word scores as a query-efficient …
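As a rough illustration of the leave-one-out idea named in this abstract (a sketch, not code from the paper), the snippet below scores each word by the drop in a classifier's confidence when that word is removed; predict_proba is a hypothetical callable mapping a text string to a scalar model score.

def loo_word_importance(text, predict_proba):
    # predict_proba: hypothetical callable, text -> scalar confidence score.
    words = text.split()
    base = predict_proba(text)  # score on the unmodified input
    scores = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])  # input with word i removed
        scores.append((word, base - predict_proba(reduced)))  # importance = score drop
    return scores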
Robust conversational agents against imperceptible toxicity triggers
Warning: this paper contains content that may be offensive or upsetting. Recent research in
Natural Language Processing (NLP) has advanced the development of various toxicity …
[PDF] I've Seen Things You Machines Wouldn't Believe: Measuring Content Predictability to Identify Automatically-Generated Text.
Modern large language models (LLMs), such as GPT-4 or ChatGPT, are capable of
producing fluent text in natural languages, making their output hard to manually differentiate …
Effective faking of verbal deception detection with target-aligned adversarial attacks
Background: Deception detection through analysing language is a promising avenue using
both human judgments and automated machine learning judgments. For both forms of …
Identifying human strategies for generating word-level adversarial examples
Adversarial examples in NLP are receiving increasing research attention. One line of
investigation is the generation of word-level adversarial examples against fine-tuned …
User-centered security in natural language processing
C Emmery - arXiv preprint arXiv:2301.04230, 2023 - arxiv.org
This dissertation proposes a framework of user-centered security in Natural Language
Processing (NLP), and demonstrates how it can improve the accessibility of related …
Towards stronger adversarial baselines through human-AI collaboration
Natural language processing (NLP) systems are often used for adversarial tasks such as
detecting spam, abuse, hate speech, and fake news. Properly evaluating such systems …
Evaluating Mitigation Approaches for Adversarial Attacks in Crowdwork
CG Harris - 2023 IEEE International Conference on Big Data …, 2023 - ieeexplore.ieee.org
Crowdsourcing has emerged as a collaborative method to accomplish various tasks using
open calls posted on platforms such as Amazon Mechanical Turk. Most requesters who post …
Ethical and Technological AI Risks Classification: A Human Vs Machine Approach
The growing use of data-driven decision systems based on Artificial Intelligence (AI) by
governments, companies and social organizations has given more attention to the …
Understanding and Guarding against Natural Language Adversarial Examples
MAJ Mozes - 2024 - discovery.ucl.ac.uk
Despite their success, machine learning models have been shown to be susceptible to
adversarial examples: carefully constructed perturbations of model inputs that are intended …