Prompting decision transformer for few-shot policy generalization

M Xu, Y Shen, S Zhang, Y Lu, D Zhao… - international …, 2022 - proceedings.mlr.press
Humans can leverage prior experience and learn novel tasks from a handful of
demonstrations. In contrast to offline meta-reinforcement learning, which aims to achieve …

Social NCE: Contrastive learning of socially-aware motion representations

Y Liu, Q Yan, A Alahi - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Learning socially-aware motion representations is at the core of recent advances in multi-
agent problems, such as human motion forecasting and robot navigation in crowds. Despite …

State regularized policy optimization on data with dynamics shift

Z Xue, Q Cai, S Liu, D Zheng… - Advances in neural …, 2024 - proceedings.neurips.cc
In many real-world scenarios, Reinforcement Learning (RL) algorithms are trained on data
with dynamics shift, i.e., with different underlying environment dynamics. A majority of current …

[HTML] Machine learning meets advanced robotic manipulation

S Nahavandi, R Alizadehsani, D Nahavandi, CP Lim… - Information …, 2024 - Elsevier
Automated industries lead to high-quality production, lower manufacturing cost and better
utilization of human resources. Robotic manipulator arms have a major role in the automation …

Offline imitation learning with a misspecified simulator

S Jiang, J Pang, Y Yu - Advances in neural information …, 2020 - proceedings.neurips.cc
In real-world decision-making tasks, learning an optimal policy without a trial-and-error
process is an appealing challenge. When expert demonstrations are available, imitation …

Multi-objective Deep Reinforcement Learning for Function Offloading in Serverless Edge Computing

Y Yang, X Du, Y Ye, J Ding, T Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Function offloading problems play a crucial role in optimizing the performance of
applications in serverless edge computing (SEC). Existing research has extensively …

[PDF] Near on-policy experience sampling in multi-objective reinforcement learning

S Wang, M Reymond, AA Irissappane… - Proceedings of the …, 2022 - aamas.csc.liv.ac.uk
In multi-objective decision problems, the same state-action pair under different preference
weights between the objectives constitutes different optimal policies. The introduction of …

Successive convex approximation based off-policy optimization for constrained reinforcement learning

C Tian, A Liu, G Huang, W Luo - IEEE Transactions on Signal …, 2022 - ieeexplore.ieee.org
Constrained reinforcement learning (CRL), also termed safe reinforcement learning, is a
promising technique enabling the deployment of RL agents in real-world systems. In this …

A Bi-objective Perspective on Controllable Language Models: Reward Dropout Improves Off-policy Control Performance

C Lee, C Lim - arXiv preprint arXiv:2310.04483, 2023 - arxiv.org
We study the theoretical aspects of CLMs (Controllable Language Models) from a bi-
objective optimization perspective. Specifically, we consider the CLMs as an off-policy RL …

[PDF] Building Adaptable Generalist Robots

M Xu - 2024 - kilthub.cmu.edu
Over the past decade, advancements in deep robot learning have enabled robots to acquire
remarkable capabilities. However, these robots often struggle to generalize to new, unseen …