Multimodal trajectory optimization for motion planning

T Osa - The International Journal of Robotics Research, 2020 - journals.sagepub.com
Existing motion planning methods often have two drawbacks: (1) goal configurations need to
be specified by a user, and (2) only a single solution is generated under a given condition. In …

Hierarchical reinforcement learning with adaptive scheduling for robot control

Z Huang, Q Liu, F Zhu - Engineering Applications of Artificial Intelligence, 2023 - Elsevier
Conventional hierarchical reinforcement learning (HRL) relies on discrete options to
represent explicitly distinguishable knowledge, which may lead to severe performance …

Hierarchical reinforcement learning for quadruped locomotion

D Jain, A Iscen, K Caluwaerts - 2019 IEEE/RSJ International …, 2019 - ieeexplore.ieee.org
Legged locomotion is a challenging task for learning algorithms, especially when the task
requires a diverse set of primitive behaviors. To solve these problems, we introduce a …

Reparameterized policy learning for multimodal trajectory optimization

Z Huang, L Liang, Z Ling, X Li… - … on Machine Learning, 2023 - proceedings.mlr.press
We investigate the challenge of parametrizing policies for reinforcement learning (RL) in
high-dimensional continuous action spaces. Our objective is to develop a multimodal policy …

Discovering diverse solutions in deep reinforcement learning by maximizing state–action-based mutual information

T Osa, V Tangkaratt, M Sugiyama - Neural Networks, 2022 - Elsevier
Reinforcement learning algorithms are typically limited to learning a single solution for a
specified task, even though diverse solutions often exist. Recent studies showed that …

Learning compositional neural programs with recursive tree search and planning

T Pierrot, G Ligner, SE Reed… - Advances in …, 2019 - proceedings.neurips.cc
We propose a novel reinforcement learning algorithm, AlphaNPI, that incorporates the
strengths of Neural Programmer-Interpreters (NPI) and AlphaZero. NPI contributes structural …

Motion planning by learning the solution manifold in trajectory optimization

T Osa - The International Journal of Robotics Research, 2022 - journals.sagepub.com
The objective function used in trajectory optimization is often non-convex and can have an
infinite set of local optima. In such cases, there are diverse solutions to perform a given task …

Spatial memory-augmented visual navigation based on hierarchical deep reinforcement learning in unknown environments

S **, X Wang, Q Meng - Knowledge-Based Systems, 2024 - Elsevier
Visual navigation in unknown environments poses significant challenges due to the
presence of many obstacles and low-texture scenes. These factors may cause frequent …

Multipolar: Multi-source policy aggregation for transfer reinforcement learning between diverse environmental dynamics

M Barekatain, R Yonetani, M Hamaya - arXiv preprint arXiv:1909.13111, 2019 - arxiv.org
Transfer reinforcement learning (RL) aims at improving the learning efficiency of an agent by
exploiting knowledge from other source agents trained on relevant tasks. However, it …

Reinforcement learning from hierarchical critics

Z Cao, CT Lin - IEEE Transactions on Neural Networks and …, 2021 - ieeexplore.ieee.org
In this study, we investigate the use of global information to speed up the learning process
and increase the cumulative rewards of reinforcement learning (RL) in competition tasks …