Google Scholar

A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Cited by 2149 · All 4 versions

[PDF] arxiv.org

MM-LLMs: Recent advances in multimodal large language models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - arxiv preprint arxiv …, 2024 - arxiv.org

In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

Cited by 211 · All 2 versions · HTML version

[PDF] arxiv.org

"Do Anything Now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models

X Shen, Z Chen, M Backes, Y Shen… - Proceedings of the 2024 on …, 2024 - dl.acm.org

The misuse of large language models (LLMs) has drawn significant attention from the
general public and LLM vendors. One particular type of adversarial prompt, known as …

Cited by 425 · All 2 versions

[PDF] neurips.cc

SimPO: Simple preference optimization with a reference-free reward

Y Meng, M Xia, D Chen - Advances in Neural Information …, 2025 - proceedings.neurips.cc

Direct Preference Optimization (DPO) is a widely used offline preference
optimization algorithm that reparameterizes reward functions in reinforcement learning from …

Cited by 190 · All 2 versions · HTML version

[PDF] arxiv.org

TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arxiv preprint arxiv …, 2024 - arxiv.org

Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Cited by 251 · All 4 versions · HTML version

[PDF] arxiv.org

GPTFuzzer: Red teaming large language models with auto-generated jailbreak prompts

J Yu, X Lin, Z Yu, X Xing - arxiv preprint arxiv:2309.10253, 2023 - arxiv.org

Large language models (LLMs) have recently experienced tremendous popularity and are
widely used from casual conversations to AI-driven programming. However, despite their …

Cited by 244 · All 2 versions · HTML version

PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arxiv e …, 2023 - ui.adsabs.harvard.edu

The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …

Cited by 242 · All 2 versions

[HTML] mlr.press

[HTML] Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press

Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Cited by 46

[PDF] wiley.com Full View

Combating misinformation in the age of LLMs: Opportunities and challenges

C Chen, K Shu - AI Magazine, 2024 - Wiley Online Library

Misinformation such as fake news and rumors is a serious threat for information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …

Cited by 128 · All 4 versions

[PDF] arxiv.org

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, RGH Cheng… - arxiv preprint arxiv …, 2023 - arxiv.org

Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

Cited by 282 · All 3 versions · HTML version

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.

A survey on evaluation of large language models