Nadav Merlis
Assistant Professor @ Technion
Verified email at technion.ac.il - Homepage
Title · Cited by · Year
Learn what not to learn: Action elimination with deep reinforcement learning
T Zahavy, M Haroush, N Merlis, DJ Mankowitz, S Mannor
arXiv preprint arXiv:1809.02121, 2018
Cited by 262 · 2018
Tight regret bounds for model-based reinforcement learning with greedy policies
Y Efroni, N Merlis, M Ghavamzadeh, S Mannor
Advances in Neural Information Processing Systems 32, 2019
Cited by 79 · 2019
Reinforcement learning with trajectory feedback
Y Efroni, N Merlis, S Mannor
Proceedings of the AAAI conference on artificial intelligence 35 (8), 7288-7295, 2021
Cited by 56 · 2021
Ensemble bootstrapping for q-learning
O Peer, C Tessler, N Merlis, R Meir
International conference on machine learning, 8454-8463, 2021
Cited by 47 · 2021
Batch-size independent regret bounds for the combinatorial multi-armed bandit problem
N Merlis, S Mannor
Conference on Learning Theory, 2465-2489, 2019
Cited by 35 · 2019
Tight lower bounds for combinatorial multi-armed bandits
N Merlis, S Mannor
Conference on Learning Theory, 2830-2857, 2020
Cited by 23 · 2020
Confidence-budget matching for sequential budgeted learning
Y Efroni, N Merlis, A Saha, S Mannor
International Conference on Machine Learning, 2937-2947, 2021
Cited by 11 · 2021
Reinforcement learning with history dependent dynamic contexts
G Tennenholtz, N Merlis, L Shani, M Mladenov, C Boutilier
International Conference on Machine Learning, 34011-34053, 2023
Cited by 7 · 2023
Lenient regret for multi-armed bandits
N Merlis, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 35 (10), 8950-8957, 2021
Cited by 7 · 2021
On preemption and learning in stochastic scheduling
N Merlis, H Richard, F Sentenac, C Odic, M Molina, V Perchet
International Conference on Machine Learning, 24478-24516, 2023
Cited by 6 · 2023
Reinforcement learning with a terminator
G Tennenholtz, N Merlis, L Shani, S Mannor, U Shalit, G Chechik, ...
Advances in Neural Information Processing Systems 35, 35696-35709, 2022
Cited by 4 · 2022
Never Worse, Mostly Better: Stable Policy Improvement in Deep Reinforcement Learning
P Khanna, G Tennenholtz, N Merlis, S Mannor, C Tessler
arXiv preprint arXiv:1910.01062, 2019
Cited by 4* · 2019
Multi-armed bandits with guaranteed revenue per arm
D Baudry, N Merlis, MB Molina, H Richard, V Perchet
International Conference on Artificial Intelligence and Statistics, 379-387, 2024
Cited by 3 · 2024
Improved algorithms for contextual dynamic pricing
M Tullii, S Gaucher, N Merlis, V Perchet
Advances in Neural Information Processing Systems 37, 126088-126117, 2025
Cited by 2 · 2025
The value of reward lookahead in reinforcement learning
N Merlis, D Baudry, V Perchet
Advances in Neural Information Processing Systems 37, 83627-83664, 2025
Cited by 1 · 2025
Reinforcement Learning with Lookahead Information
N Merlis
Advances in Neural Information Processing Systems 37, 64523-64581, 2024
2024
Stable Matching with Ties: Approximation Ratios and Learning
S Lin, S Mauras, N Merlis, V Perchet
arXiv preprint arXiv:2411.03270, 2024
2024
On Bits and Bandits: Quantifying the Regret-Information Trade-off
I Shufaro, N Merlis, N Weinberger, S Mannor
arXiv preprint arXiv:2405.16581, 2024
2024
Ranking with Popularity Bias: User Welfare under Self-Amplification Dynamics
G Tennenholtz, M Mladenov, N Merlis, RL Axtell, C Boutilier
arXiv preprint arXiv:2305.18333, 2023
2023
Query-Reward Tradeoffs in Multi-Armed Bandits
N Merlis, Y Efroni, S Mannor
arXiv preprint arXiv:2110.05724, 2021
2021
Articles 1–20