David Brandfonbrener
Meta
Verified email at meta.com · Homepage
Title · Cited by · Year
Offline RL without off-policy evaluation
D Brandfonbrener, W Whitney, R Ranganath, J Bruna
Advances in neural information processing systems 34, 4933-4946, 2021
Cited by 184 · 2021
Frequentist regret bounds for randomized least-squares value iteration
A Zanette*, D Brandfonbrener*, E Brunskill, M Pirotta, A Lazaric
International Conference on Artificial Intelligence and Statistics, 1954-1964, 2020
Cited by 154 · 2020
Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning
D Yarats*, D Brandfonbrener*, H Liu, M Laskin, P Abbeel, A Lazaric, ...
arXiv preprint arXiv:2201.13425, 2022
Cited by 96 · 2022
When does return-conditioned supervised learning work for offline reinforcement learning?
D Brandfonbrener, A Bietti, J Buckman, R Laroche, J Bruna
Advances in Neural Information Processing Systems 35, 1542-1553, 2022
Cited by 83 · 2022
Repeat after me: Transformers are better than state space models at copying
S Jelassi, D Brandfonbrener, SM Kakade, E Malach
arXiv preprint arXiv:2402.01032, 2024
Cited by 53 · 2024
PsychRNN: An accessible and flexible Python package for training recurrent neural network models on cognitive tasks
DB Ehrlich, JT Stone, D Brandfonbrener, A Atanasov, JD Murray
eNeuro 8 (1), 2021
Cited by 28 · 2021
Geometric insights into the convergence of nonlinear TD learning
D Brandfonbrener, J Bruna
International Conference on Learning Representations (ICLR), 2020
Cited by 28* · 2020
Evaluating representations by the complexity of learning low-loss predictors
WF Whitney, MJ Song, D Brandfonbrener, J Altosaar, K Cho
arXiv preprint arXiv:2009.07368, 2020
Cited by 27 · 2020
Inverse dynamics pretraining learns good representations for multitask imitation
D Brandfonbrener, O Nachum, J Bruna
Advances in Neural Information Processing Systems 36, 2023
Cited by 16 · 2023
Offline Contextual Bandits with Overparameterized Models
D Brandfonbrener, WF Whitney, R Ranganath, J Bruna
International Conference on Machine Learning (ICML), 2021
Cited by 16* · 2020
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search
D Brandfonbrener, S Henniger, S Raja, T Prasad, C Loughridge, ...
arXiv preprint arXiv:2402.08147, 2024
Cited by 13* · 2024
SOAP: Improving and stabilizing Shampoo using Adam
N Vyas, D Morwani, R Zhao, I Shapira, D Brandfonbrener, L Janson, ...
arXiv preprint arXiv:2409.11321, 2024
Cited by 11 · 2024
Deconstructing what makes a good optimizer for language models
R Zhao*, D Morwani*, D Brandfonbrener*, N Vyas*, S Kakade
arXiv preprint arXiv:2407.07972, 2024
Cited by 10 · 2024
Visual backtracking teleoperation: A data collection protocol for offline image-based reinforcement learning
D Brandfonbrener, S Tu, A Singh, S Welker, C Boodoo, N Matni, J Varley
2023 IEEE International Conference on Robotics and Automation (ICRA), 11336 …, 2023
Cited by 10 · 2023
Quantile filtered imitation learning
D Brandfonbrener, WF Whitney, R Ranganath, J Bruna
arXiv preprint arXiv:2112.00950, 2021
Cited by 8 · 2021
Incorporating explicit uncertainty estimates into deep offline reinforcement learning
D Brandfonbrener, RT Combes, R Laroche
arXiv preprint arXiv:2206.01085, 2022
Cited by 6 · 2022
Universal length generalization with Turing programs
K Hou, D Brandfonbrener, S Kakade, S Jelassi, E Malach
arXiv preprint arXiv:2407.03310, 2024
Cited by 4 · 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
D Brandfonbrener, H Zhang, A Kirsch, JR Schwarz, S Kakade
arXiv preprint arXiv:2406.10670, 2024
Cited by 4 · 2024
Mixture of parrots: Experts improve memorization more than reasoning
S Jelassi, C Mohri, D Brandfonbrener, A Gu, N Vyas, N Anand, ...
arXiv preprint arXiv:2410.19034, 2024
Cited by 3 · 2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
K Li, S Jelassi, H Zhang, S Kakade, M Wattenberg, D Brandfonbrener
arXiv preprint arXiv:2402.14688, 2024
Cited by 3 · 2024
Articles 1–20