- Academic Search

H Xu, Y Ma, HC Liu, D Deb, H Liu, JL Tang… - International journal of …, 2020 - Springer

Deep neural networks (DNN) have achieved unprecedented success in numerous machine
learning tasks in various domains. However, the existence of adversarial examples raises …

Cited by: 820; 13 versions

[PDF] aclanthology.org

Red teaming language models with language models

E Perez, S Huang, F Song, T Cai, R Ring… - arxiv preprint arxiv …, 2022 - arxiv.org

Language Models (LMs) often cannot be deployed because of their potential to harm users
in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using …

Cited by: 595; 4 versions

[PDF] arxiv.org

Universal adversarial triggers for attacking and analyzing NLP

E Wallace, S Feng, N Kandpal, M Gardner… - arxiv preprint arxiv …, 2019 - arxiv.org

Adversarial examples highlight model vulnerabilities and are useful for evaluation and
interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens …

Cited by: 933; 6 versions

[PDF] acm.org

Why so toxic? measuring and triggering toxic behavior in open-domain chatbots

WM Si, M Backes, J Blackburn, E De Cristofaro… - Proceedings of the …, 2022 - dl.acm.org

Chatbots are used in many applications, e.g., automated agents, smart home assistants,
interactive characters in online games, etc. Therefore, it is crucial to ensure they do not …

Cited by: 72; 16 versions

[PDF] aaai.org

Hierarchical reinforcement learning for open-domain dialog

A Saleh, N Jaques, A Ghandeharioun, J Shen… - Proceedings of the AAAI …, 2020 - aaai.org

Open-domain dialog generation is a challenging problem; maximum likelihood training can
lead to repetitive outputs, models have difficulty tracking long-term conversational goals, and …

Cited by: 74; 8 versions

[PDF] arxiv.org

Negative training for neural dialogue response generation

T He, J Glass - arxiv preprint arxiv:1903.02134, 2019 - arxiv.org

Although deep learning models have brought tremendous advancements to the field of open-
domain dialogue response generation, recent research results have revealed that the …

Cited by: 51; 6 versions

[PDF] arxiv.org

TIGS: An inference algorithm for text infilling with gradient search

D Liu, J Fu, P Liu, J Lv - arxiv preprint arxiv:1905.10752, 2019 - arxiv.org

Text infilling is defined as a task for filling in the missing part of a sentence or paragraph,
which is suitable for many real-world natural language generation scenarios. However …

Cited by: 34; 6 versions

[PDF] arxiv.org

Constructing highly inductive contexts for dialogue safety through controllable reverse generation

Z Zhang, J Cheng, H Sun, J Deng, F Mi, Y Wang… - arxiv preprint arxiv …, 2022 - arxiv.org

Large pretrained language models can easily produce toxic or biased content, which is
prohibitive for practical use. In order to detect such toxic generations, existing methods rely …

Cited by: 11; 3 versions

[PDF] star-ai.eu

XAI enhancing cyber defence against adversarial attacks in industrial applications

G Makridis, S Theodoropoulos… - 2022 IEEE 5th …, 2022 - ieeexplore.ieee.org

In recent years there is a surge of interest in the interpretability and explainability of AI
systems, which is largely motivated by the need for ensuring the transparency and …

Cited by: 11; 2 versions

[PDF] arxiv.org

Say what I want: Towards the dark side of neural dialogue models

H Liu, T Derr, Z Liu, J Tang - arxiv preprint arxiv:1909.06044, 2019 - arxiv.org

Neural dialogue models have been widely adopted in various chatbot applications because
of their good performance in simulating and generalizing human conversations. However …

Cited by: 21; 3 versions

Detecting egregious responses in neural sequence-to-sequence models

Adversarial attacks and defenses in images, graphs and text: A review