- Academic Search

Natural language reasoning, a survey

F Yu, H Zhang, P Tiwari, B Wang - ACM Computing Surveys, 2024 - dl.acm.org

This survey article proposes a clearer view of Natural Language Reasoning (NLR) in the
field of Natural Language Processing (NLP), both conceptually and practically …

Cited by: 66

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Cited by: 222

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk

Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

Cited by: 375

TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Cited by: 243

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press

Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Cited by: 39

Evaluating the moral beliefs encoded in LLMs

N Scherrer, C Shi, A Feder… - Advances in Neural …, 2024 - proceedings.neurips.cc

This paper presents a case study on the design, administration, post-processing, and
evaluation of surveys on large language models (LLMs). It comprises two components: (1) A …

Cited by: 96

Large pre-trained language models contain human-like biases of what is right and wrong to do

P Schramowski, C Turan, N Andersen… - Nature Machine …, 2022 - nature.com

Artificial writing is permeating our lives due to recent advances in large-scale, transformer-
based language models (LMs) such as BERT, GPT-2 and GPT-3. Using them as pre-trained …

Cited by: 317

When to make exceptions: Exploring language models as accounts of human moral judgment

Z Jin, S Levine, F Gonzalez Adauto… - Advances in neural …, 2022 - proceedings.neurips.cc

AI systems are becoming increasingly intertwined with human life. In order to effectively
collaborate with humans and ensure safety, AI systems need to be able to understand …

Cited by: 97

Latent hatred: A benchmark for understanding implicit hate speech

M ElSherief, C Ziems, D Muchlinski, V Anupindi… - arXiv preprint arXiv …, 2021 - arxiv.org

Hate speech has grown significantly on social media, causing serious consequences for
victims of all demographics. Despite much attention being paid to characterize and detect …

Cited by: 211

NLPositionality: Characterizing design biases of datasets and models

S Santy, JT Liang, RL Bras, K Reinecke… - arXiv preprint arXiv …, 2023 - arxiv.org

Design biases in NLP systems, such as performance differences for different populations,
often stem from their creator's positionality, i.e., views and lived experiences shaped by …

Cited by: 73
