Multimodal trajectory optimization for motion planning
T Osa - The International Journal of Robotics Research, 2020 - journals.sagepub.com
Existing motion planning methods often have two drawbacks: (1) goal configurations need to
be specified by a user, and (2) only a single solution is generated under a given condition. In …
Hierarchical reinforcement learning with adaptive scheduling for robot control
Z Huang, Q Liu, F Zhu - Engineering Applications of Artificial Intelligence, 2023 - Elsevier
Conventional hierarchical reinforcement learning (HRL) relies on discrete options to
represent explicitly distinguishable knowledge, which may lead to severe performance …
Hierarchical reinforcement learning for quadruped locomotion
Legged locomotion is a challenging task for learning algorithms, especially when the task
requires a diverse set of primitive behaviors. To solve these problems, we introduce a …
Reparameterized policy learning for multimodal trajectory optimization
We investigate the challenge of parametrizing policies for reinforcement learning (RL) in
high-dimensional continuous action spaces. Our objective is to develop a multimodal policy …
Discovering diverse solutions in deep reinforcement learning by maximizing state–action-based mutual information
Reinforcement learning algorithms are typically limited to learning a single solution for a
specified task, even though diverse solutions often exist. Recent studies showed that …
Learning compositional neural programs with recursive tree search and planning
We propose a novel reinforcement learning algorithm, AlphaNPI, that incorporates the
strengths of Neural Programmer-Interpreters (NPI) and AlphaZero. NPI contributes structural …
Motion planning by learning the solution manifold in trajectory optimization
T Osa - The International Journal of Robotics Research, 2022 - journals.sagepub.com
The objective function used in trajectory optimization is often non-convex and can have an
infinite set of local optima. In such cases, there are diverse solutions to perform a given task …
Spatial memory-augmented visual navigation based on hierarchical deep reinforcement learning in unknown environments
Visual navigation in unknown environments poses significant challenges due to the
presence of many obstacles and low-texture scenes. These factors may cause frequent …
Multipolar: Multi-source policy aggregation for transfer reinforcement learning between diverse environmental dynamics
Transfer reinforcement learning (RL) aims at improving the learning efficiency of an agent by
exploiting knowledge from other source agents trained on relevant tasks. However, it …
Reinforcement learning from hierarchical critics
In this study, we investigate the use of global information to speed up the learning process
and increase the cumulative rewards of reinforcement learning (RL) in competition tasks …