Cosmin Paduraru
DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 3270; Year: 2023
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 1211; Year: 2024
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
G Dulac-Arnold, N Levine, DJ Mankowitz, J Li, C Paduraru, S Gowal, ...
Machine Learning 110 (9), 2419-2468, 2021
Cited by: 730*; Year: 2021
Safe exploration in continuous action spaces
G Dalal, K Dvijotham, M Vecerik, T Hester, C Paduraru, Y Tassa
arXiv preprint arXiv:1801.08757, 2018
Cited by: 578; Year: 2018
RL Unplugged: A suite of benchmarks for offline reinforcement learning
C Gulcehre, Z Wang, A Novikov, T Paine, S Gómez, K Zolna, R Agarwal, ...
Advances in Neural Information Processing Systems 33, 7248-7259, 2020
Cited by: 222*; Year: 2020
Faster sorting algorithms discovered using deep reinforcement learning
DJ Mankowitz, A Michi, A Zhernov, M Gelmi, M Selvi, C Paduraru, ...
Nature 618 (7964), 257-263, 2023
Cited by: 200; Year: 2023
Hyperparameter selection for offline reinforcement learning
TL Paine, C Paduraru, A Michi, C Gulcehre, K Zolna, A Novikov, Z Wang, ...
arXiv preprint arXiv:2007.09055, 2020
Cited by: 175; Year: 2020
Benchmarks for deep off-policy evaluation
J Fu, M Norouzi, O Nachum, G Tucker, Z Wang, A Novikov, M Yang, ...
arXiv preprint arXiv:2103.16596, 2021
Cited by: 103; Year: 2021
Training language models to self-correct via reinforcement learning
A Kumar, V Zhuang, R Agarwal, Y Su, JD Co-Reyes, A Singh, K Baumli, ...
arXiv preprint arXiv:2409.12917, 2024
Cited by: 59; Year: 2024
COptiDICE: Offline constrained reinforcement learning via stationary distribution correction estimation
J Lee, C Paduraru, DJ Mankowitz, N Heess, D Precup, KE Kim, A Guez
arXiv preprint arXiv:2204.08957, 2022
Cited by: 59; Year: 2022
Autoregressive dynamics models for offline policy evaluation and optimization
MR Zhang, TL Paine, O Nachum, C Paduraru, G Tucker, Z Wang, ...
arXiv preprint arXiv:2104.13877, 2021
Cited by: 53; Year: 2021
Transformers meet directed graphs
S Geisler, Y Li, DJ Mankowitz, AT Cemgil, S Günnemann, C Paduraru
International Conference on Machine Learning, 11144-11172, 2023
Cited by: 43; Year: 2023
Off-policy evaluation in Markov decision processes
C Paduraru
McGill University, 2013
Cited by: 43; Year: 2013
Controlling commercial cooling systems using reinforcement learning
J Luo, C Paduraru, O Voicu, Y Chervonyi, S Munns, J Li, C Qian, P Dutta, ...
arXiv preprint arXiv:2211.07357, 2022
Cited by: 36; Year: 2022
Responding to new information in a mining complex: Fast mechanisms using machine learning
C Paduraru, R Dimitrakopoulos
Mining Technology, 2019
Cited by: 34; Year: 2019
Adaptive policies for short-term material flow optimization in a mining complex
C Paduraru, R Dimitrakopoulos
Mining Technology 127 (1), 56-63, 2018
Cited by: 29; Year: 2018
Active offline policy selection
K Konyushova, Y Chen, T Paine, C Gulcehre, C Paduraru, DJ Mankowitz, ...
Advances in Neural Information Processing Systems 34, 24631-24644, 2021
Cited by: 28; Year: 2021
Off-policy learning with options and recognizers
D Precup, C Paduraru, A Koop, RS Sutton, S Singh
Advances in Neural Information Processing Systems 18, 2005
Cited by: 28; Year: 2005
Development and validation of a supervised machine learning radar Doppler spectra peak-finding algorithm
H Kalesse, T Vogl, C Paduraru, E Luke
Atmospheric Measurement Techniques 12 (8), 4591-4617, 2019
Cited by: 25; Year: 2019
Robust constrained reinforcement learning for continuous control with model misspecification
DJ Mankowitz, DA Calian, R Jeong, C Paduraru, N Heess, S Dathathri, ...
arXiv preprint arXiv:2010.10644, 2020
Cited by: 15; Year: 2020
Articles 1–20