TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights

Citing articles:

[PDF] arxiv.org

Fantastic LLMs for Preference Data Annotation and How to (not) Find Them

G Xu, K Xu, S Sudalairaj, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Preference tuning of large language models (LLMs) relies on high-quality human preference
data, which is often expensive and time-consuming to gather. While existing methods can …
