The inadequacy of reinforcement learning from human feedback-radicalizing large language models via semantic vulnerabilities
This study is an empirical investigation into the semantic vulnerabilities of four popular
pretrained commercial large language models (LLMs) to ideological manipulation. Using …
Stepcoder: Improve code generation with reinforcement learning from compiler feedback
The advancement of large language models (LLMs) has significantly propelled the field of
code generation. Previous work integrated reinforcement learning (RL) with compiler …
Direct preference optimization using sparse feature-level constraints
The alignment of large language models (LLMs) with human preferences remains a key
challenge. While post-training techniques like Reinforcement Learning from Human …
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking
This work identifies the Energy Loss Phenomenon in Reinforcement Learning from Human
Feedback (RLHF) and its connection to reward hacking. Specifically, energy loss in the final …
Improving DeFi Accessibility through Efficient Liquidity Provisioning with Deep Reinforcement Learning
H Xu, A Brini - arXiv preprint arXiv:2501.07508, 2025 - arxiv.org
This paper applies deep reinforcement learning (DRL) to optimize liquidity provisioning in
Uniswap v3, a decentralized finance (DeFi) protocol implementing an automated market …
Ensuring trustworthy code: leveraging a static analyzer to identify and mitigate defects in generated code
The rise of large language models (LLMs) has greatly advanced code generation
capabilities. A recent StackOverflow survey found that 70% of developers are using or …