How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective
This paper introduces a novel generalized self-imitation learning ($\textbf{GSIL}$)
framework, which effectively and efficiently aligns large language models with offline …
Fact-Level Confidence Calibration and Self-Correction
Confidence calibration in LLMs, i.e., aligning their self-assessed confidence with the actual
accuracy of their responses, enabling them to self-evaluate the correctness of their outputs …
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Existing preference optimization objectives for language model alignment require additional
hyperparameters that must be extensively tuned to achieve optimal performance, increasing …
GeomCLIP: Contrastive Geometry-Text Pre-training for Molecules
Pretraining molecular representations is crucial for drug and material discovery. Recent
methods focus on learning representations from geometric structures, effectively capturing …
MITA: Bridging the Gap between Model and Data for Test-time Adaptation
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the
generalizability of models. However, existing mainstream TTA methods, predominantly …