A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Text data augmentation for deep learning

C Shorten, TM Khoshgoftaar, B Furht - Journal of Big Data, 2021 - Springer
Natural Language Processing (NLP) is one of the most captivating applications of
Deep Learning. In this survey, we consider how the Data Augmentation training strategy can …

"Do Anything Now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models

X Shen, Z Chen, M Backes, Y Shen… - Proceedings of the 2024 on …, 2024 - dl.acm.org
The misuse of large language models (LLMs) has drawn significant attention from the
general public and LLM vendors. One particular type of adversarial prompt, known as …

Jailbreaking black box large language models in twenty queries

P Chao, A Robey, E Dobriban, H Hassani… - arXiv preprint arXiv …, 2023 - arxiv.org
There is growing interest in ensuring that large language models (LLMs) align with human
values. However, the alignment of such models is vulnerable to adversarial jailbreaks, which …

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

Should ChatGPT be biased? Challenges and risks of bias in large language models

E Ferrara - arXiv preprint arXiv:2304.03738, 2023 - arxiv.org
As the capabilities of generative language models continue to advance, the implications of
biases ingrained within these models have garnered increasing attention from researchers …

Tree of attacks: Jailbreaking black-box LLMs automatically

A Mehrotra, M Zampetakis… - Advances in …, 2025 - proceedings.neurips.cc
While Large Language Models (LLMs) display versatile functionality, they continue
to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned

D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai… - arXiv preprint arXiv …, 2022 - arxiv.org
We describe our early efforts to red team language models in order to simultaneously
discover, measure, and attempt to reduce their potentially harmful outputs. We make three …

PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu
The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …