Natural language reasoning, a survey
This survey article proposes a clearer view of Natural Language Reasoning (NLR) in the
field of Natural Language Processing (NLP), both conceptually and practically …
AI alignment: A comprehensive survey
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
From pretraining data to language models to downstream tasks: Tracking the trails of political biases leading to unfair NLP models
Language models (LMs) are pretrained on diverse data sources, including news, discussion
forums, books, and online encyclopedias. A significant portion of this data includes opinions …
Evaluating the moral beliefs encoded in LLMs
This paper presents a case study on the design, administration, post-processing, and
evaluation of surveys on large language models (LLMs). It comprises two components: (1) A …
REFINER: Reasoning feedback on intermediate representations
Language models (LMs) have recently shown remarkable performance on reasoning tasks
by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However …
When to make exceptions: Exploring language models as accounts of human moral judgment
AI systems are becoming increasingly intertwined with human life. In order to effectively
collaborate with humans and ensure safety, AI systems need to be able to understand …
MoCa: Measuring human-language model alignment on causal and moral judgment tasks
Human commonsense understanding of the physical and social world is organized around
intuitive theories. These theories support making causal and moral judgments. When …
SafetyBench: Evaluating the safety of large language models with multiple choice questions
With the rapid development of Large Language Models (LLMs), increasing attention has
been paid to their safety concerns. Consequently, evaluating the safety of LLMs has become …
The moral integrity corpus: A benchmark for ethical dialogue systems
Conversational agents have come increasingly closer to human competence in open-
domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely …