Daniil Tiapkin
Other names: Daniil Tyapkin, Daniil Nikolaevich Tyapkin
Verified email at polytechnique.edu
Title
Cited by
Year
Improved complexity bounds in Wasserstein barycenter problem
D Dvinskikh, D Tiapkin
International Conference on Artificial Intelligence and Statistics, 1738-1746, 2021
Cited by 27, 2021
Generative Flow Networks as Entropy-Regularized RL
D Tiapkin, N Morozov, A Naumov, D Vetrov
AISTATS-2024, 2023
Cited by 25, 2023
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
D Tiapkin, D Belomestny, E Moulines, A Naumov, S Samsonov, Y Tang, ...
International Conference on Machine Learning, 21380-21431, 2022
Cited by 20, 2022
Fast Rates for Maximum Entropy Exploration
D Tiapkin, D Belomestny, D Calandriello, E Moulines, R Munos, ...
International Conference on Machine Learning, 2023
Cited by 18, 2023
Stochastic saddle-point optimization for the Wasserstein barycenter problem
D Tiapkin, A Gasnikov, P Dvurechensky
Optimization Letters 16 (7), 2145-2175, 2022
Cited by 13, 2022
Primal-Dual Stochastic Mirror Descent for MDPs
D Tiapkin, A Gasnikov
International Conference on Artificial Intelligence and Statistics, 9723-9740, 2022
Cited by 13, 2022
Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees
D Tiapkin, D Belomestny, D Calandriello, E Moulines, R Munos, ...
Neural Information Processing Systems, 2022
Cited by 11, 2022
Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability
S Samsonov, D Tiapkin, A Naumov, E Moulines
The Thirty Seventh Annual Conference on Learning Theory, 4511-4547, 2024
Cited by 9*, 2024
Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold
S Schechtman, D Tiapkin, M Muehlebach, E Moulines
The Thirty Sixth Annual Conference on Learning Theory, 1228-1258, 2023
Cited by 9, 2023
Demonstration-Regularized RL
D Tiapkin, D Belomestny, D Calandriello, E Moulines, A Naumov, ...
ICLR-2024, 2023
Cited by 8*, 2023
Incentivized Learning in Principal-Agent Bandit Games
A Scheid, D Tiapkin, E Boursier, A Capitaine, EME Mhamdi, É Moulines, ...
arXiv preprint arXiv:2403.03811, 2024
Cited by 6, 2024
Model-free posterior sampling via learning rate randomization
D Tiapkin, D Belomestny, D Calandriello, E Moulines, R Munos, ...
Advances in Neural Information Processing Systems 36, 73719-73774, 2023
Cited by 3, 2023
First-Order Constrained Optimization: Non-smooth Dynamical System Viewpoint
S Schechtman, D Tiapkin, E Moulines, MI Jordan, M Muehlebach
IFAC-PapersOnLine 55 (16), 236-241, 2022
Cited by 3, 2022
Improving GFlowNets with Monte Carlo Tree Search
N Morozov, D Tiapkin, S Samsonov, A Naumov, D Vetrov
arXiv preprint arXiv:2406.13655, 2024
Cited by 2, 2024
Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
S Labbi, D Tiapkin, L Mancini, P Mangold, E Moulines
arXiv preprint arXiv:2410.22908, 2024
Cited by 1, 2024
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
T Gritsaev, N Morozov, S Samsonov, D Tiapkin
arXiv preprint arXiv:2410.15474, 2024
Cited by 1, 2024
Revisiting Non-Acyclic GFlowNets in Discrete Environments
N Morozov, I Maksimov, D Tiapkin, S Samsonov
arXiv preprint arXiv:2502.07735, 2025
2025
On Teacher Hacking in Language Model Distillation
D Tiapkin, D Calandriello, J Ferret, S Perrin, N Vieillard, A Ramé, ...
arXiv preprint arXiv:2502.02671, 2025
2025
A New Bound on the Cumulant Generating Function of Dirichlet Processes
P Perrault, D Belomestny, P Ménard, É Moulines, A Naumov, D Tiapkin, ...
arXiv preprint arXiv:2409.18621, 2024
2024
Narrowing the Gap between Adversarial and Stochastic MDPs via Policy Optimization
D Tiapkin, E Chzhen, G Stoltz
arXiv preprint arXiv:2407.05704, 2024
2024