A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Mm-llms: Recent advances in multimodal large language models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

" do anything now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models

X Shen, Z Chen, M Backes, Y Shen… - Proceedings of the 2024 on …, 2024 - dl.acm.org
The misuse of large language models (LLMs) has drawn significant attention from the
general public and LLM vendors. One particular type of adversarial prompt, known as …

Simpo: Simple preference optimization with a reference-free reward

Y Meng, M **a, D Chen - Advances in Neural Information …, 2025 - proceedings.neurips.cc
Direct Preference Optimization (DPO) is a widely used offline preference
optimization algorithm that reparameterizes reward functions in reinforcement learning from …
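
The snippet cuts off, but the contrast the title points to can be sketched in standard DPO/SimPO notation (the symbols β, γ, π_ref, y_w, y_l are assumptions drawn from that literature, not from this snippet): DPO scores responses against a frozen reference policy, while SimPO uses a length-normalized log-probability as a reference-free reward and adds a target margin γ to the Bradley-Terry objective.

\[
r_{\mathrm{DPO}}(x,y) \;=\; \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
r_{\mathrm{SimPO}}(x,y) \;=\; \frac{\beta}{|y|} \log \pi_\theta(y \mid x)
\]
\[
\mathcal{L}_{\mathrm{SimPO}}(\pi_\theta) \;=\; -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[\log \sigma\!\left(\frac{\beta}{|y_w|}\log \pi_\theta(y_w \mid x) \;-\; \frac{\beta}{|y_l|}\log \pi_\theta(y_l \mid x) \;-\; \gamma\right)\right]
\]

Here y_w and y_l are the preferred and dispreferred responses. Dropping π_ref (and the partition term it implies, which cancels in the Bradley-Terry comparison) is what makes the reward reference-free, and the 1/|y| normalization keeps longer responses from being favored simply for accumulating log-probability.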

Trustllm: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Gptfuzzer: Red teaming large language models with auto-generated jailbreak prompts

J Yu, X Lin, Z Yu, X Xing - arXiv preprint arXiv:2309.10253, 2023 - arxiv.org
Large language models (LLMs) have recently experienced tremendous popularity and are
widely used from casual conversations to AI-driven programming. However, despite their …

Promptbench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu
The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Combating misinformation in the age of llms: Opportunities and challenges

C Chen, K Shu - AI Magazine, 2024 - Wiley Online Library
Misinformation such as fake news and rumors is a serious threat for information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, R Guo, H Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …