DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs
Direct preference learning offers a promising and computation-efficient approach beyond supervised
fine-tuning (SFT) for improving code generation in coding large language models (LMs) …