Nonparametric return distribution approximation for reinforcement learning T Morimura, M Sugiyama, H Kashima, H Hachiya, T Tanaka Proceedings of the 27th International Conference on Machine Learning (ICML …, 2010 | 295 | 2010 |
Parametric return density estimation for reinforcement learning T Morimura, M Sugiyama, H Kashima, H Hachiya, T Tanaka arXiv preprint arXiv:1203.3497, 2012 | 146 | 2012 |
Map matching with hidden Markov model on sampled road network R Raymond, T Morimura, T Osogami, N Hirosue Proceedings of the 21st International Conference on Pattern Recognition …, 2012 | 81 | 2012 |
これからの強化学習 (Reinforcement Learning from Now On) 牧野貴樹, 澁谷長史, 白川, 浅田, 2016 | 53 | 2016 |
IBM mega traffic simulator T Osogami, T Imamichi, H Mizuta, T Morimura, R Raymond, T Suzumura, ... IBM Res., Tokyo, Japan, IBM Res. Rep. RT0896, 2012 | 46 | 2012 |
City-wide traffic flow estimation from a limited number of low-quality cameras T Idé, T Katsuki, T Morimura, R Morris IEEE Transactions on Intelligent Transportation Systems 18 (4), 950-959, 2016 | 44 | 2016 |
Utilizing the natural gradient in temporal difference reinforcement learning with eligibility traces T Morimura, E Uchibe, K Doya International Symposium on Information Geometry and Its Applications, 256-263, 2005 | 44 | 2005 |
Solving inverse problem of Markov chain with partial observations T Morimura, T Osogami, T Idé Advances in neural information processing systems 26, 2013 | 41 | 2013 |
Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning T Morimura, E Uchibe, J Yoshimoto, J Peters, K Doya Neural computation 22 (2), 342-376, 2010 | 37 | 2010 |
Assistance generation T Katsuki, T Morimura US Patent 10,878,337, 2020 | 29 | 2020 |
Updating policy parameters under Markov decision process system environment T Morimura, T Osogami, T Shirai US Patent 8,818,925, 2014 | 25 | 2014 |
A generalized natural actor-critic algorithm T Morimura, E Uchibe, J Yoshimoto, K Doya Advances in neural information processing systems 22, 2009 | 22 | 2009 |
強化学習 (Reinforcement Learning) 森村哲郎 (T Morimura) 講談社 (Kodansha), 2019 | 21 | 2019 |
A new natural policy gradient by stationary distribution metric T Morimura, E Uchibe, J Yoshimoto, K Doya Machine Learning and Knowledge Discovery in Databases: European Conference …, 2008 | 21 | 2008 |
Cooperative neural network reinforcement learning S Dasgupta, T Morimura, T Osogami US Patent App. 15/647,543, 2019 | 18 | 2019 |
Adaptive step-size policy gradients with average reward metric T Matsubara, T Morimura, J Morimoto Proceedings of 2nd Asian Conference on Machine Learning, 285-298, 2010 | 15 | 2010 |
Determining optimal action in consideration of risk T Morimura, T Osogami US Patent 8,639,556, 2014 | 14 | 2014 |
A consistent method for graph based anomaly localization S Hara, T Morimura, T Takahashi, H Yanagisawa, T Suzuki Artificial intelligence and statistics, 333-341, 2015 | 13 | 2015 |
Statistical origin-destination generation with multiple sources T Morimura, S Kato Proceedings of the 21st International Conference on Pattern Recognition …, 2012 | 13 | 2012 |
Filtered direct preference optimization T Morimura, M Sakamoto, Y Jinnai, K Abe, K Ariu arXiv preprint arXiv:2404.13846, 2024 | 12 | 2024 |