- Academic Search

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arxiv preprint arxiv …, 2023 - arxiv.org

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Cited by 487 - All 4 versions

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arxiv preprint arxiv …, 2023 - arxiv.org

AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, so do risks from misalignment. To provide a comprehensive …

Cited by 243 - All 4 versions

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2025 - dl.acm.org

The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …

Cited by 939 - All 4 versions

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arxiv preprint arxiv …, 2023 - arxiv.org

While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Cited by 974 - All 2 versions

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arxiv preprint arxiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Cited by 469 - All 7 versions

Ablating concepts in text-to-image diffusion models

N Kumari, B Zhang, SY Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale text-to-image diffusion models can generate high-fidelity images with powerful
compositional ability. However, these models are typically trained on an enormous amount …

Cited by 177 - All 6 versions

TrustLLM: Trustworthiness in large language models

L Sun, Y Huang, H Wang, S Wu, Q Zhang… - arxiv preprint arxiv …, 2024 - mosis.eecs.utk.edu

Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Cited by 255 - All 6 versions

Editing large language models: Problems, methods, and opportunities

Y Yao, P Wang, B Tian, S Cheng, Z Li, S Deng… - arxiv preprint arxiv …, 2023 - arxiv.org

Despite the ability to train capable LLMs, the methodology for maintaining their relevancy
and rectifying errors remains elusive. To this end, the past few years have witnessed a surge …

Cited by 248 - All 8 versions

Language models represent space and time

W Gurnee, M Tegmark - arxiv preprint arxiv:2310.02207, 2023 - arxiv.org

The capabilities of large language models (LLMs) have sparked debate over whether such
systems just learn an enormous collection of superficial statistics or a set of more coherent …

Cited by 189 - All 3 versions

Unified concept editing in diffusion models

R Gandikota, H Orgad, Y Belinkov… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-to-image models suffer from various safety issues that may limit their suitability for
deployment. Previous methods have separately addressed individual issues of bias …

Cited by 144 - All 7 versions


Mass-editing memory in a transformer

Challenges and applications of large language models
