Prompting decision transformer for few-shot policy generalization
Humans can leverage prior experience and learn novel tasks from a handful of
demonstrations. In contrast to offline meta-reinforcement learning, which aims to achieve …
Social NCE: Contrastive learning of socially-aware motion representations
Learning socially-aware motion representations is at the core of recent advances in multi-
agent problems, such as human motion forecasting and robot navigation in crowds. Despite …
State regularized policy optimization on data with dynamics shift
In many real-world scenarios, Reinforcement Learning (RL) algorithms are trained on data
with dynamics shift, i.e., with different underlying environment dynamics. A majority of current …
Machine learning meets advanced robotic manipulation
Automated industries lead to high-quality production, lower manufacturing cost and better
utilization of human resources. Robotic manipulator arms have a major role in the automation …
Offline imitation learning with a misspecified simulator
In real-world decision-making tasks, learning an optimal policy without a trial-and-error
process is an appealing challenge. When expert demonstrations are available, imitation …
Multi-objective Deep Reinforcement Learning for Function Offloading in Serverless Edge Computing
Function offloading problems play a crucial role in optimizing the performance of
applications in serverless edge computing (SEC). Existing research has extensively …
Near on-policy experience sampling in multi-objective reinforcement learning
In multi-objective decision problems, the same state-action pair under different preference
weights between the objectives constitutes different optimal policies. The introduction of …
Successive convex approximation based off-policy optimization for constrained reinforcement learning
C Tian, A Liu, G Huang, W Luo - IEEE Transactions on Signal …, 2022 - ieeexplore.ieee.org
Constrained reinforcement learning (CRL), also termed safe reinforcement learning, is a
promising technique enabling the deployment of RL agents in real-world systems. In this …
A Bi-objective Perspective on Controllable Language Models: Reward Dropout Improves Off-policy Control Performance
C Lee, C Lim - arXiv preprint arXiv:2310.04483, 2023 - arxiv.org
We study the theoretical aspects of CLMs (Controllable Language Models) from a bi-
objective optimization perspective. Specifically, we consider the CLMs as an off-policy RL …
Building Adaptable Generalist Robots
M Xu - 2024 - kilthub.cmu.edu
Over the past decade, advancements in deep robot learning have enabled robots to acquire
remarkable capabilities. However, these robots often struggle to generalize to new, unseen …