UniDexGrasp: Universal robotic dexterous grasping via learning diverse proposal generation and goal-conditioned policy Y Xu, W Wan, J Zhang, H Liu, Z Shan, H Shen, R Wang, H Geng, Y Weng, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 95 | 2023 |
DPO meets PPO: Reinforced token optimization for RLHF H Zhong, Z Shan, G Feng, W Xiong, X Cheng, L Zhao, D He, J Bian, ... arXiv preprint arXiv:2404.18922, 2024 | 37 | 2024 |