Neural policy gradient methods: Global optimality and rates of convergence L Wang, Q Cai, Z Yang, Z Wang International Conference on Learning Representations, 2020 | 271 | 2020 |
Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu, Z Wang International Conference on Learning Representations, 2022 | 162 | 2022 |
Provably efficient causal reinforcement learning with confounded observational data L Wang, Z Yang, Z Wang Advances in Neural Information Processing Systems 34, 21164-21175, 2021 | 68 | 2021 |
On the global optimality of model-agnostic meta-learning L Wang, Q Cai, Z Yang, Z Wang International Conference on Machine Learning, 9837-9846, 2020 | 54 | 2020 |
Principled exploration via optimistic bootstrapping and backward induction C Bai, L Wang, L Han, J Hao, A Garg, P Liu, Z Wang International Conference on Machine Learning, 577-587, 2021 | 47 | 2021 |
Contrastive UCB: Provably efficient contrastive self-supervised learning in online reinforcement learning S Qiu, L Wang, C Bai, Z Yang, Z Wang International Conference on Machine Learning, 18168-18210, 2022 | 39 | 2022 |
Breaking the curse of many agents: Provable mean embedding Q-iteration for mean-field reinforcement learning L Wang, Z Yang, Z Wang International Conference on Machine Learning, 10092-10103, 2020 | 37 | 2020 |
Dynamic bottleneck for robust self-supervised exploration C Bai, L Wang, L Han, A Garg, J Hao, P Liu, Z Wang Advances in Neural Information Processing Systems 34, 17007-17020, 2021 | 33 | 2021 |
Variational dynamic for self-supervised exploration in deep reinforcement learning C Bai, P Liu, K Liu, L Wang, Y Zhao, L Han, Z Wang IEEE Transactions on Neural Networks and Learning Systems 34 (8), 4776-4790, 2021 | 19 | 2021 |
Addressing hindsight bias in multigoal reinforcement learning C Bai, L Wang, Y Wang, Z Wang, R Zhao, C Bai, P Liu IEEE Transactions on Cybernetics 53 (1), 392-405, 2021 | 17 | 2021 |
Monotonic quantile network for worst-case offline reinforcement learning C Bai, T Xiao, Z Zhu, L Wang, F Zhou, A Garg, B He, P Liu, Z Wang IEEE Transactions on Neural Networks and Learning Systems, 2022 | 13 | 2022 |
False correlation reduction for offline reinforcement learning Z Deng, Z Fu, L Wang, Z Yang, C Bai, T Zhou, Z Wang, J Jiang IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 9 | 2023 |
SCORE: Spurious correlation reduction for offline reinforcement learning Z Deng, Z Fu, L Wang, Z Yang, C Bai, Z Wang, J Jiang arXiv preprint arXiv:2110.12468, 2021 | 8 | 2021 |
Represent to control partially observed systems: Representation learning with provable sample efficiency L Wang, Q Cai, Z Yang, Z Wang The Eleventh International Conference on Learning Representations, 2023 | 4 | 2023 |
Optimistic exploration with learned features provably solves Markov decision processes with neural dynamics S Zheng, L Wang, S Qiu, Z Fu, Z Yang, C Szepesvari, Z Wang The Eleventh International Conference on Learning Representations, 2022 | 3 | 2022 |
Statistical-computational tradeoff in single index models L Wang, Z Yang, Z Wang Advances in Neural Information Processing Systems 32, 2019 | 3 | 2019 |
Privileged Knowledge Distillation for Sim-to-Real Policy Generalization H He, C Bai, H Lai, L Wang, W Zhang arXiv preprint arXiv:2305.18464, 2023 | 2 | 2023 |