Alex Ayoub
Department of Computing Science, University of Alberta
Verified email at ualberta.ca
Title
Cited by
Year
Model-based reinforcement learning with value-targeted regression
A Ayoub, Z Jia, C Szepesvari, M Wang, L Yang
International Conference on Machine Learning, 463-474, 2020
Cited by: 350, Year: 2020
Randomized exploration in reinforcement learning with general value function approximation
H Ishfaq, Q Cui, V Nguyen, A Ayoub, Z Yang, Z Wang, D Precup, L Yang
International Conference on Machine Learning, 4607-4616, 2021
Cited by: 46, Year: 2021
An elementary proof that Q-learning converges almost surely
MT Regehr, A Ayoub
arXiv preprint arXiv:2108.02827, 2021
Cited by: 11, Year: 2021
Exploration via linearly perturbed loss minimisation
D Janz, S Liu, A Ayoub, C Szepesvári
International Conference on Artificial Intelligence and Statistics, 721-729, 2024
Cited by: 4, Year: 2024
Managing temporal resolution in continuous value estimation: A fundamental trade-off
ZV Zhang, J Kirschner, J Zhang, F Zanini, A Ayoub, M Dehghan, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 4, Year: 2024
Switching the Loss Reduces the Cost in Batch Reinforcement Learning
A Ayoub, K Wang, V Liu, S Robertson, J McInerney, D Liang, N Kallus, ...
arXiv preprint arXiv:2403.05385, 2024
Cited by: 3, Year: 2024
Mitigating the curse of horizon in Monte-Carlo returns
A Ayoub, D Szepesvari, F Zanini, B Chan, D Gupta, BC da Silva, ...
Reinforcement Learning Conference, 2024
Cited by: 1, Year: 2024
Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits
S Liu, A Ayoub, F Sentenac, X Tan, C Szepesvári
arXiv preprint arXiv:2410.01112, 2024
Year: 2024
Towards Sample Efficient Reinforcement Learning with Function Approximation
A Ayoub
Year: 2021
Does weighting improve matrix factorization for recommender systems?
A Ayoub, S Robertson, D Liang, H Steck, N Kallus
The Web Conference 2025
Resmax: An Alternative Soft-Greedy Operator for Reinforcement Learning
E Miahi, R MacQueen, A Ayoub, A Masoumzadeh, M White
Transactions on Machine Learning Research
Articles 1–11