Kenshi Abe
CyberAgent, Inc.
Verified email at cyberagent.co.jp - Homepage
Title
Cited by
Year
Thresholded lasso bandit
K Ariu, K Abe, A Proutière
International Conference on Machine Learning (ICML 2022), 2022
Cited by: 26; Year: 2022
Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games
K Abe, Y Kaneko
International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2021
Cited by: 21; Year: 2021
Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games
K Abe, K Ariu, M Sakamoto, K Toyoshima, A Iwasaki
International Conference on Artificial Intelligence and Statistics (AISTATS …, 2023
Cited by: 20; Year: 2023
Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games
K Abe, M Sakamoto, A Iwasaki
Conference on Uncertainty in Artificial Intelligence (UAI 2022), 2022
Cited by: 18; Year: 2022
Filtered direct preference optimization
T Morimura, M Sakamoto, Y Jinnai, K Abe, K Ariu
Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), 2024
Cited by: 12; Year: 2024
Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search
K Abe, J Komiyama, A Iwasaki
International Joint Conference on Artificial Intelligence (IJCAI 2022), 2022
Cited by: 10; Year: 2022
Regularized best-of-n sampling to mitigate reward hacking for language model alignment
Y Jinnai, T Morimura, K Ariu, K Abe
ICML 2024 Workshop on Models of Human Feedback for AI Alignment, 2024
Cited by: 8; Year: 2024
Adaptively Perturbed Mirror Descent for Learning in Games
K Abe, K Ariu, M Sakamoto, A Iwasaki
International Conference on Machine Learning (ICML 2024), 2024
Cited by: 5; Year: 2024
Learning Fair Division from Bandit Feedback
H Yamada, J Komiyama, K Abe, A Iwasaki
International Conference on Artificial Intelligence and Statistics (AISTATS …, 2024
Cited by: 5; Year: 2024
Model-based minimum bayes risk decoding
Y Jinnai, T Morimura, U Honda, K Ariu, K Abe
International Conference on Machine Learning (ICML 2024), 2024
Cited by: 4*; Year: 2024
Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems
R Togashi, K Abe, Y Saito
The Web Conference (WWW 2024), 2024
Cited by: 4*; Year: 2024
Mean-variance efficient reinforcement learning by expected quadratic utility maximization
M Kato, K Nakagawa, K Abe, T Morimura
arXiv preprint arXiv:2010.01404, 2020
Cited by: 4*; Year: 2020
Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium
Y Fujimoto, K Ariu, K Abe
International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
Cited by: 3; Year: 2023
A practical guide of off-policy evaluation for bandit problems
M Kato, K Abe, K Ariu, S Yasui
arXiv preprint arXiv:2010.12470, 2020
Cited by: 3; Year: 2020
Online Learning for Bidding Agent in First Price Auction
G Morishita, K Abe, K Ogawa, Y Kaneko
AAAI-20 Workshop on Reinforcement Learning in Games, 2020
Cited by: 3; Year: 2020
Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes
T Morimura, K Ota, K Abe, P Zhang
Reinforcement Learning Conference (RLC 2024), 2024
Cited by: 2; Year: 2024
Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games
Y Fujimoto, K Ariu, K Abe
Annual AAAI Conference on Artificial Intelligence (AAAI 2024), 2024
Cited by: 2; Year: 2024
Return-Aligned Decision Transformer
T Tanaka, K Abe, K Ariu, T Morimura, E Simo-Serra
arXiv preprint arXiv:2402.03923, 2024
Cited by: 2; Year: 2024
Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry
Y Fujimoto, K Ariu, K Abe
International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2025
Cited by: 1; Year: 2025
Nash Equilibrium and Learning Dynamics in Three-Player Matching m-Action Games
Y Fujimoto, K Ariu, K Abe
International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2025
Cited by: 1; Year: 2025
Articles 1–20