Diffusion policy policy optimization

AZ Ren, J Lidard, LL Ankile, A Simeonov… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Diffusion Policy Policy Optimization, DPPO, an algorithmic framework
including best practices for fine-tuning diffusion-based policies (e.g., Diffusion Policy) in …

Learning multimodal behaviors from scratch with diffusion policy gradient

Z Li, R Krohn, T Chen, A Ajay, P Agrawal… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep reinforcement learning (RL) algorithms typically parameterize the policy as a deep
network that outputs either a deterministic action or a stochastic one modeled as a Gaussian …

Scaling diffusion policy in transformer to 1 billion parameters for robotic manipulation

M Zhu, Y Zhu, J Li, J Wen, Z Xu, N Liu, R Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion Policy is a powerful tool for learning end-to-end visuomotor robot
control. It is expected that Diffusion Policy possesses scalability, a key attribute for deep …

Integrating reinforcement learning with foundation models for autonomous robotics: Methods and perspectives

A Moroncelli, V Soni, AA Shahid, M Maccarini… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled
datasets, exhibit powerful capabilities in understanding complex patterns and generating …

One-step diffusion policy: Fast visuomotor policies via diffusion distillation

Z Wang, Z Li, A Mandlekar, Z Xu, J Fan… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models, praised for their success in generative tasks, are increasingly being
applied to robotics, demonstrating exceptional performance in behavior cloning. However …

Discrete policy: Learning disentangled action space for multi-task robotic manipulation

K Wu, Y Zhu, J Li, J Wen, N Liu, Z Xu, Q Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
Learning visuomotor policy for multi-task robotic manipulation has been a long-standing
challenge for the robotics community. The difficulty lies in the diversity of action space …

Diffusion actor-critic with entropy regulator

Y Wang, L Wang, Y Jiang, W Zou, T Liu, X Song… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning (RL) has proven highly effective in addressing complex decision-
making and control tasks. However, in most traditional RL algorithms, the policy is typically …

Policy agnostic RL: Offline RL and online RL fine-tuning of any class and backbone

MS Mark, T Gao, GG Sampaio, MK Srirama… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in learning decision-making policies can largely be attributed to training
expressive policy models, largely via imitation learning. While imitation learning discards …

Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization

H Li, Z Jiang, Y Chen, D Zhao - arXiv preprint arXiv:2410.00051, 2024 - arxiv.org
With high-dimensional state spaces, visual reinforcement learning (RL) faces significant
challenges in exploitation and exploration, resulting in low sample efficiency and training …

Diffusion-based reinforcement learning via Q-weighted variational policy optimization

S Ding, K Hu, Z Zhang, K Ren, W Zhang, J Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for
their powerful expressiveness and multimodality. It has been verified that utilizing diffusion …