Open-Ethical AI: Advancements in Open-Source Human-Centric Neural Language Models

S Sicari, JF Cevallos M, A Rizzardi… - ACM Computing …, 2024 - dl.acm.org
This survey summarises the most recent methods for building and assessing helpful, honest,
and harmless neural language models, considering small, medium, and large-size models …

Generative language models exhibit social identity biases

T Hu, Y Kyrychenko, S Rathje, N Collier… - Nature Computational …, 2025 - nature.com
Social identity biases, particularly the tendency to favor one's own group (ingroup solidarity)
and derogate other groups (outgroup hostility), are deeply rooted in human psychology and …

PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback

Y Peng, AD Gotmare, M Lyu, C Xiong… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are widely adopted for assisting in software development
tasks, yet their performance evaluations have narrowly focused on the functional correctness …

Teaching models to balance resisting and accepting persuasion

E Stengel-Eskin, P Hase, M Bansal - arXiv preprint arXiv:2410.14596, 2024 - arxiv.org
Large language models (LLMs) are susceptible to persuasion, which can pose risks when
models are faced with an adversarial interlocutor. We take a first step towards defending …

Claude 2.0 large language model: Tackling a real-world classification problem with a new iterative prompt engineering approach

L Caruccio, S Cirillo, G Polese, G Solimando… - Intelligent Systems with …, 2024 - Elsevier
In the last year, Large Language Models (LLMs) have transformed the way of tackling
problems, opening up new perspectives in various works and research fields, due to their …

Antagonistic AI

A Cai, I Arawjo, EL Glassman - arXiv preprint arXiv:2402.07350, 2024 - arxiv.org
The vast majority of discourse around AI development assumes that subservient, "moral"
models aligned with "human values" are universally beneficial--in short, that good AI is …

Sycophancy in Large Language Models: Causes and Mitigations

L Malmqvist - arXiv preprint arXiv:2411.15287, 2024 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a wide
range of natural language processing tasks. However, their tendency to exhibit sycophantic …

Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies

SSY Kim, JW Vaughan, QV Liao, T Lombrozo… - arXiv preprint arXiv …, 2025 - arxiv.org
Large language models (LLMs) can produce erroneous responses that sound fluent and
convincing, raising the risk that users will rely on these responses as if they were correct …

Prompt Leakage effect and mitigation strategies for multi-turn LLM Applications

D Agarwal, AR Fabbri, B Risher, P Laban… - Proceedings of the …, 2024 - aclanthology.org
Prompt leakage poses a compelling security and privacy threat in LLM applications.
Leakage of system prompts may compromise intellectual property, and act as adversarial …

Understanding the Effects of Iterative Prompting on Truthfulness

S Krishna, C Agarwal, H Lakkaraju - arXiv preprint arXiv:2402.06625, 2024 - arxiv.org
The development of Large Language Models (LLMs) has notably transformed numerous
sectors, offering impressive text generation capabilities. Yet, the reliability and truthfulness of …