Preference tuning with human feedback on language, speech, and vision tasks: A survey
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback
Reinforcement learning from human feedback (RLHF) has proven effective in enhancing the
instruction-following capabilities of large language models; however, it remains …
Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach
Self-improvement in multimodal large language models (MLLMs) is crucial for enhancing
their reliability and robustness. However, current methods often rely heavily on MLLMs …
Modality-fair preference optimization for trustworthy MLLM alignment
Direct Preference Optimization (DPO) is effective for aligning large language models (LLMs),
but when applied to multimodal models (MLLMs), it often favors text over image information …
Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal
Multimodal large language models (MLLMs) excel at multimodal perception and
understanding, yet their tendency to generate hallucinated or inaccurate responses …
Efficient and Scalable Large Multimodal Models
S Shen - 2024 - search.proquest.com
The rapid advancement of large multimodal models (LMMs) has revolutionized the field of
deep learning, enabling sophisticated understanding and generation across various …
Controlling Multimodal LLMs via Reward-guided Decoding
As Multimodal Large Language Models (MLLMs) gain widespread applicability, it is
becoming increasingly desirable to adapt them for diverse user needs. In this paper, we …