Hiteshi Sharma

Cited by

	All	Since 2020
Citations	1282	1256
h-index	10	10
i10-index	10	10

Citations per year

Year	Citations
2018	6
2019	14
2020	26
2021	25
2022	27
2023	78
2024	853
2025	242

Public access (based on funding mandates)

8 articles available

0 articles not available

Hiteshi Sharma

Microsoft

Verified email at microsoft.com

Machine learning, reinforcement learning, online learning


Title	Authors	Venue	Cited by	Year
Phi-3 technical report: A highly capable language model locally on your phone	M Abdin, J Aneja, H Awadalla, A Awadallah, AA Awan, N Bach, A Bahree, ...	arXiv preprint arXiv:2404.14219, 2024	906	2024
Model-free reinforcement learning in infinite-horizon average-reward Markov decision processes	CY Wei, MJ Jahromi, H Luo, H Sharma, R Jain	International conference on machine learning, 10170-10180, 2020	123	2020
Evaluating cognitive maps and planning in large language models with CogEval	I Momennejad, H Hasanbeig, F Vieira Frujeri, H Sharma, N Jojic, ...	Advances in Neural Information Processing Systems 36, 69736-69751, 2023	63*	2023
Fine-tuning language models with advantage-induced policy alignment	B Zhu, H Sharma, FV Frujeri, S Dong, C Zhu, MI Jordan, J Jiao	arXiv preprint arXiv:2306.02231, 2023	41	2023
Self-exploring language models: Active preference elicitation for online alignment	S Zhang, D Yu, H Sharma, H Zhong, Z Liu, Z Yang, S Wang, H Hassan, ...	arXiv preprint arXiv:2405.19332, 2024	27	2024
Language models can be logical solvers	J Feng, R Xu, J Hao, H Sharma, Y Shen, D Zhao, W Chen	Findings of the Association for Computational Linguistics: NAACL 2024, 2023	22*	2023
A universal empirical dynamic programming algorithm for continuous state MDPs	WB Haskell, R Jain, H Sharma, P Yu	IEEE Transactions on Automatic Control 65 (1), 115-129, 2019	21	2019
Allure: A systematic protocol for auditing and improving LLM-based evaluation of text using iterative in-context-learning	H Hasanbeig, H Sharma, L Betthauser, FV Frujeri, I Momennejad	arXiv preprint arXiv:2309.13701 3, 2023	17	2023
Approximate relative value learning for average-reward continuous state MDPs	H Sharma, M Jafarnia-Jahromi, R Jain	Uncertainty in Artificial Intelligence, 956-964, 2020	16	2020
An empirical relative value learning algorithm for non-parametric MDPs with continuous state space	H Sharma, R Jain, A Gupta	2019 18th European Control Conference (ECC), 1368-1373, 2019	13	2019
Randomized function fitting-based empirical value iteration	WB Haskell, P Yu, H Sharma, R Jain	2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2467-2472, 2017	9	2017
An approximately optimal relative value learning algorithm for averaged MDPs with continuous states and actions	H Sharma, R Jain	2019 57th Annual Allerton Conference on Communication, Control, and …, 2019	7	2019
Phi-3 safety post-training: Aligning language models with a "break-fix" cycle	E Haider, D Perez-Becker, T Portet, P Madan, A Garg, A Ashfaq, ...	arXiv preprint arXiv:2407.13833, 2024	5	2024
Optimal spectrum sensing for cognitive radio with imperfect detector	H Sharma, A Patel, SN Merchant, UB Desai	2014 IEEE 79th Vehicular Technology Conference (VTC Spring), 1-5, 2014	4	2014
An empirical dynamic programming algorithm for continuous MDPs	WB Haskell, R Jain, H Sharma, P Yu	arXiv preprint arXiv:1709.07506, 2017	3	2017
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning	Y Chen, S Wang, Z Yang, H Sharma, N Karampatziakis, D Yu, ...	arXiv preprint arXiv:2407.02119, 2024	1	2024
Finite Time Guarantees for Continuous State MDPs with Generative Model	H Sharma, R Jain	2020 59th IEEE Conference on Decision and Control (CDC), 3617-3622, 2020	1	2020
Randomized Policy Learning for Continuous State and Action MDPs	H Sharma, R Jain	arXiv preprint arXiv:2006.04331, 2020	1	2020
Empirical algorithms for general stochastic systems with continuous states and actions	H Sharma, R Jain, W Haskell	2019 IEEE 58th Conference on Decision and Control (CDC), 6344-6349, 2019	1	2019
QoS aware optimal base station ON/OFF policy and frequency planning	H Sharma, V Vaid, P Chaporkar, GS Kasbekar	Indian Inst. Technol. Bombay, 2015	1	2015
