Comparative evaluation of commercial large language models on PromptBench: An English and Chinese perspective

S Wang, Q Ouyang, B Wang - 2024 - researchsquare.com
This study explores the performance disparities observed between
English and Chinese in large language models (LLMs), motivated by the growing need for …

A comparative analysis of large language models to evaluate robustness and reliability in adversarial conditions

T Goto, K Ono, A Morita - Authorea Preprints, 2024 - techrxiv.org
This study conducted a comprehensive evaluation of four prominent Large Language Models
(LLMs): Google Gemini, Mistral 8x7B, ChatGPT-4, and Microsoft Phi-1.5, to assess their …

A survey of AIGC large model evaluation: Enabling technologies, security risks, and countermeasures

许志伟, **海龙, **博, **涛, 王嘉泰… - Journal of Frontiers …, 2024 - search.ebscohost.com
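Artificial intelligence generated content (AIGC) models have attracted widespread attention and adoption worldwide owing to their strong content generation capabilities.
However, the rapid development of large AIGC models has also introduced a series of risks, such as the interpretability of model-generated results …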