Explainable generative AI (GenXAI): A survey, conceptualization, and research agenda
J Schneider - Artificial Intelligence Review, 2024 - Springer
Generative AI (GenAI) represents a shift from AI's ability to “recognize” to its ability to
“generate” solutions for a wide range of tasks. As generated solutions and applications grow …
Identifying and mitigating vulnerabilities in LLM-integrated applications
F Jiang - 2024 - search.proquest.com
Large language models (LLMs) are increasingly deployed as the backend for various
applications, including code completion tools and AI-powered search engines. Unlike …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
DecodingTrust: A comprehensive assessment of trustworthiness in GPT models
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
Robust prompt optimization for defending language models against jailbreaking attacks
Despite advances in AI alignment, large language models (LLMs) remain vulnerable to
adversarial attacks or jailbreaking, in which adversaries can modify prompts to induce …
Exploring the limits of domain-adaptive training for detoxifying large-scale language models
Pre-trained language models (LMs) are shown to easily generate toxic language. In this
work, we systematically explore domain-adaptive training to reduce the toxicity of language …
An LLM can fool itself: A prompt-based adversarial attack
The wide-ranging applications of large language models (LLMs), especially in safety-critical
domains, necessitate the proper evaluation of the LLM's adversarial robustness. This paper …
Exposing the Achilles' heel of textual hate speech classifiers using indistinguishable adversarial examples
The accessibility of online hate speech has increased significantly, making it crucial for
social-media companies to prioritize efforts to curb its spread. Although deep learning …
Transferable adversarial distribution learning: Query-efficient adversarial attack against large language models
It is a challenging task to fool a text classifier based on deep neural networks under the
black-box setting where the target model can only be queried. Among the existing black-box …
Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges
P Kumar - International Journal of Multimedia Information …, 2024 - Springer
Large language models (LLMs) have exhibited remarkable efficacy and proficiency in a
wide array of NLP endeavors. Nevertheless, concerns are growing rapidly regarding the …