Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
A survey on evaluation of large language models
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …
industry, owing to their unprecedented performance in various applications. As LLMs …
Datasets for large language models: A comprehensive survey
This paper embarks on an exploration into the Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …
[PDF][PDF] DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.
Abstract Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
[PDF][PDF] Trustllm: Trustworthiness in large language models
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …
attention for their excellent natural language processing capabilities. Nonetheless, these …
On the robustness of chatgpt: An adversarial and out-of-distribution perspective
ChatGPT is a recent chatbot service released by OpenAI and is receiving increasing
attention over the past few months. While evaluations of various aspects of ChatGPT have …
attention over the past few months. While evaluations of various aspects of ChatGPT have …
Pandalm: An automatic evaluation benchmark for llm instruction tuning optimization
Instruction tuning large language models (LLMs) remains a challenging task, owing to the
complexity of hyperparameter selection and the difficulty involved in evaluating the tuned …
complexity of hyperparameter selection and the difficulty involved in evaluating the tuned …
Promptbench: Towards evaluating the robustness of large language models on adversarial prompts
The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …
necessitates a comprehensive understanding of their robustness to prompts. In response to …
[HTML][HTML] Position: TrustLLM: Trustworthiness in large language models
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …
natural language processing capabilities. Nonetheless, these LLMs present many …
Revisiting out-of-distribution robustness in nlp: Benchmarks, analysis, and llms evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
Risk taxonomy, mitigation, and assessment benchmarks of large language model systems
Large language models (LLMs) have strong capabilities in solving diverse natural language
processing tasks. However, the safety and security issues of LLM systems have become the …
processing tasks. However, the safety and security issues of LLM systems have become the …