Towards a unified view of preference learning for large language models: A survey

B Gao, F Song, Y Miao, Z Cai, Z Yang, L Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) exhibit remarkably powerful capabilities. One of the crucial
factors behind this success is aligning the LLM's output with human preferences. This …

TreeBoN: Enhancing inference-time alignment with speculative tree-search and best-of-N sampling

J Qiu, Y Lu, Y Zeng, J Guo, J Geng, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Inference-time alignment enhances the performance of large language models without
requiring additional training or fine-tuning but presents challenges due to balancing …
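
The Best-of-N baseline this title builds on is simple to state; below is a minimal, generic sketch (sample N completions, score them with a reward model, keep the best), not TreeBoN's speculative tree-search variant. `generate_fn`, `reward_fn`, and the default `n` are hypothetical stand-ins for an LLM sampler and a trained reward model.

```python
# Minimal sketch of vanilla best-of-N sampling, assuming hypothetical
# `generate_fn` (draws one completion) and `reward_fn` (scores a
# prompt/completion pair); TreeBoN's speculative tree-search is not shown.
from typing import Callable, List


def best_of_n(
    prompt: str,
    generate_fn: Callable[[str], str],
    reward_fn: Callable[[str, str], float],
    n: int = 16,
) -> str:
    # Draw N independent candidate completions for the prompt.
    candidates: List[str] = [generate_fn(prompt) for _ in range(n)]
    # Score each candidate with the reward model.
    scores = [reward_fn(prompt, c) for c in candidates]
    # Return the completion the reward model prefers most.
    return candidates[max(range(n), key=lambda i: scores[i])]
```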

Cascade reward sampling for efficient decoding-time alignment

B Li, Y Wang, A Grama, R Zhang - arXiv preprint arXiv:2406.16306, 2024 - arxiv.org
Aligning large language models (LLMs) with human preferences is critical for their
deployment. Recently, decoding-time alignment has emerged as an effective plug-and-play …

Inference-time language model alignment via integrated value guidance

Z Liu, Z Zhou, Y Wang, C Yang, Y Qiao - arXiv preprint arXiv:2409.17819, 2024 - arxiv.org
Large language models are typically fine-tuned to align with human preferences, but tuning
large models is computationally intensive and complex. In this work, we introduce …

Towards building specialized generalist ai with system 1 and system 2 fusion

K Zhang, B Qi, B Zhou - arXiv preprint arXiv:2407.08642, 2024 - arxiv.org
In this perspective paper, we introduce the concept of Specialized Generalist Artificial
Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence …

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

A Rawat, S Schoepf, G Zizzo, G Cornacchia… - arXiv preprint arXiv …, 2024 - arxiv.org
As generative AI, particularly large language models (LLMs), becomes increasingly
integrated into production applications, new attack surfaces and vulnerabilities emerge and …

Decoding-time Realignment of Language Models

T Liu, S Guo, L Bianco, D Calandriello… - arXiv preprint arXiv …, 2024 - arxiv.org
Aligning language models with human preferences is crucial for reducing errors and biases
in these models. Alignment techniques, such as reinforcement learning from human …

Multi-Objective Alignment of Large Language Models Through Hypervolume Maximization

S Mukherjee, A Lalitha, S Sengupta… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-objective alignment from human feedback (MOAHF) in large language models (LLMs)
is a challenging problem as human preferences are complex, multifaceted, and often …
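
For readers unfamiliar with the objective named in this title: the hypervolume of a set of objective vectors is the volume they jointly dominate relative to a reference point, so maximizing it pushes solutions toward a broad Pareto front. Below is a minimal 2-D sketch of the metric itself, not the paper's optimization procedure; the maximization convention, the reference point, and the example reward axes are assumptions.

```python
# Generic 2-D hypervolume (area dominated above a reference point),
# assuming both objectives are to be maximized. Illustrative only; this
# is the metric, not the paper's alignment algorithm.
from typing import List, Tuple


def hypervolume_2d(points: List[Tuple[float, float]],
                   ref: Tuple[float, float]) -> float:
    # Keep only points that strictly dominate the reference point.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep in decreasing order of the first objective.
    pts.sort(key=lambda p: p[0], reverse=True)
    area, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:  # dominated points contribute nothing
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Example: two non-dominated reward vectors, e.g. (helpfulness, harmlessness).
print(hypervolume_2d([(3.0, 1.0), (1.0, 3.0)], ref=(0.0, 0.0)))  # 5.0
```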

Towards Inference-time Category-wise Safety Steering for Large Language Models

A Bhattacharjee, S Ghosh, T Rebedea… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) have seen unprecedented advancements in
capabilities and applications across a variety of use-cases, safety alignment of these models …

A Moral Imperative: The Need for Continual Superalignment of Large Language Models

G Puthumanaillam, M Vora, P Thangeda… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper examines the challenges associated with achieving life-long superalignment in
AI systems, particularly large language models (LLMs). Superalignment is a theoretical …