Security and privacy challenges of large language models: A survey

BC Das, MH Amini, Y Wu - ACM Computing Surveys, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …

Protecting your LLMs with information bottleneck

Z Liu, Z Wang, L Xu, J Wang, L Song, T Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of large language models (LLMs) has revolutionized the field of natural
language processing, yet these models can be attacked to produce harmful content. Despite efforts …

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

A Rawat, S Schoepf, G Zizzo, G Cornacchia… - arXiv preprint arXiv …, 2024 - arxiv.org
As generative AI, particularly large language models (LLMs), becomes increasingly
integrated into production applications, new attack surfaces and vulnerabilities emerge and …

Improved techniques for optimization-based jailbreaking on large language models

X Jia, T Pang, C Du, Y Huang, J Gu, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are being rapidly developed, and a key component of their
widespread deployment is their safety-related alignment. Many red-teaming efforts aim to …

BlackDAN: A black-box multi-objective approach for effective and contextual jailbreaking of large language models

X Wang, VSJ Huang, R Chen, H Wang, C Pan… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) exhibit remarkable capabilities across various tasks,
they encounter potential security risks such as jailbreak attacks, which exploit vulnerabilities …