Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts

T Fu, Y Hou, J McAuley, R Yan - arxiv preprint arxiv:2408.05094, 2024 - arxiv.org
The task of multi-objective alignment aims at balancing and controlling the different
alignment objectives (eg, helpfulness, harmlessness and honesty) of large language models …