Preference tuning with human feedback on language, speech, and vision tasks: A survey

GI Winata, H Zhao, A Das, W Tang, DD Yao… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …

Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback

J Ji, J Zhou, H Lou, B Chen, D Hong, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning from human feedback (RLHF) has proven effective in enhancing the
instruction-following capabilities of large language models; however, it remains …

Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach

S Deng, W Zhao, YJ Li, K Wan, D Miranda… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-improvement in multimodal large language models (MLLMs) is crucial for enhancing
their reliability and robustness. However, current methods often rely heavily on MLLMs …

Modality-fair preference optimization for trustworthy MLLM alignment

S Jiang, Y Zhang, R Chen, Y Jin, Z Liu - arXiv preprint arXiv:2410.15334, 2024 - arxiv.org
Direct Preference Optimization (DPO) is effective for aligning large language models (LLMs),
but when applied to multimodal models (MLLMs), it often favors text over image information …

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal

Y Wang, Z Zhu, H Liu, Y Liao, H Liu, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) excel at multimodal perception and
understanding, yet their tendency to generate hallucinated or inaccurate responses …

Efficient and Scalable Large Multimodal Models

S Shen - 2024 - search.proquest.com
The rapid advancement of large multimodal models (LMMs) has revolutionized the field of
deep learning, enabling sophisticated understanding and generation across various …

Controlling Multimodal LLMs via Reward-guided Decoding

O Mañas, P D'Oro, K Sinha, A Romero-Soriano… - … Models: Evolving AI for … - openreview.net
As Multimodal Large Language Models (MLLMs) gain widespread applicability, it is
becoming increasingly desirable to adapt them for diverse user needs. In this paper, we …