Theoretical comparisons of positive-unlabeled learning against positive-negative learning G Niu, MC Du Plessis, T Sakai, Y Ma, M Sugiyama Advances in neural information processing systems 29, 2016 | 148 | 2016 |
A policy search method for temporal logic specified reinforcement learning tasks X Li, Y Ma, C Belta 2018 Annual American Control Conference (ACC), 240-245, 2018 | 86 | 2018 |
Hybrid constraint SVR for facial age estimation J Liu, Y Ma, L Duan, F Wang, Y Liu Signal Processing 94, 576-582, 2014 | 45 | 2014 |
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers Y Ma, A Olshevsky, C Szepesvári, V Saligrama Journal of Machine Learning Research 21 (133), 1-36, 2020 | 29 | 2020 |
Bandit-based task assignment for heterogeneous crowdsourcing H Zhang, Y Ma, M Sugiyama Neural computation 27 (11), 2447-2475, 2015 | 21 | 2015 |
Automata guided reinforcement learning with demonstrations X Li, Y Ma, C Belta arXiv preprint arXiv:1809.06305, 2018 | 15 | 2018 |
Double layer multiple task learning for age estimation with insufficient training samples Y Ma, J Liu, X Yang, Y Liu, N Zheng Neurocomputing 147, 380-386, 2015 | 12 | 2015 |
Facial age estimation from web photos using multiple-instance learning X Yang, J Liu, Y Ma, J Xue 2014 IEEE international conference on multimedia and expo (ICME), 1-6, 2014 | 11 | 2014 |
An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions Y Ma, T Zhao, K Hatano, M Sugiyama ECML PKDD 2014, 2014 | 8 | 2014 |
Crowdsourcing with sparsely interacting workers Y Ma, A Olshevsky, V Saligrama, C Szepesvari arXiv preprint arXiv:1706.06660, 2017 | 6 | 2017 |
Automata guided hierarchical reinforcement learning for zero-shot skill composition X Li, Y Ma, C Belta | 5 | 2018 |
Automata-guided hierarchical reinforcement learning for skill composition X Li, Y Ma, C Belta arXiv preprint arXiv:1711.00129, 2017 | 4 | 2017 |
Online Markov decision processes with policy iteration Y Ma, H Zhang, M Sugiyama arXiv preprint arXiv:1510.04454, 2015 | 3 | 2015 |
Gradual fine-tuning with graph routing for multi-source unsupervised domain adaptation Y Ma, S Louvan, Z Wang arXiv preprint arXiv:2411.07185, 2024 | | 2024 |
Efficient Pointwise-Pairwise Learning-to-Rank for News Recommendation N Kannen, Y Ma, GJJ van den Burg, JB Faddoul arXiv preprint arXiv:2409.17711, 2024 | | 2024 |
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers Y Ma, A Olshevsky, C Szepesvári, V Saligrama ICML 2018, 2018 | | 2018 |
Automata guided hierarchical reinforcement learning for zero-shot skill composition X Li, Y Ma, C Belta arXiv preprint arXiv:1711.00129, 2017 | | 2017 |
Online decision making in non-stationary Markovian environments Y Ma 2015 | | 2015 |
An Online Policy Gradient Algorithm for Continuous State and Action Markov Decision Processes with Bandit Feedback Y Ma, M Sugiyama IEICE Technical Report 114 (306 …, 2014 | | 2014 |
Automata Guided Skill Composition X Li, Y Ma, C Belta | | |