David Brandfonbrener
Meta
Verified email at meta.com · Homepage
Title · Cited by · Year
Offline RL without off-policy evaluation
D Brandfonbrener, W Whitney, R Ranganath, J Bruna
Advances in neural information processing systems 34, 4933-4946, 2021
Cited by 184 · 2021
Frequentist regret bounds for randomized least-squares value iteration
A Zanette*, D Brandfonbrener*, E Brunskill, M Pirotta, A Lazaric
International Conference on Artificial Intelligence and Statistics, 1954-1964, 2020
Cited by 154 · 2020
Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning
D Yarats*, D Brandfonbrener*, H Liu, M Laskin, P Abbeel, A Lazaric, ...
arXiv preprint arXiv:2201.13425, 2022
Cited by 96 · 2022
When does return-conditioned supervised learning work for offline reinforcement learning?
D Brandfonbrener, A Bietti, J Buckman, R Laroche, J Bruna
Advances in Neural Information Processing Systems 35, 1542-1553, 2022
Cited by 83 · 2022
Repeat after me: Transformers are better than state space models at copying
S Jelassi, D Brandfonbrener, SM Kakade, E Malach
arXiv preprint arXiv:2402.01032, 2024
Cited by 53 · 2024
PsychRNN: An accessible and flexible Python package for training recurrent neural network models on cognitive tasks
DB Ehrlich, JT Stone, D Brandfonbrener, A Atanasov, JD Murray
eNeuro 8 (1), 2021
Cited by 28 · 2021
Geometric insights into the convergence of nonlinear TD learning
D Brandfonbrener, J Bruna
International Conference on Learning Representations (ICLR), 2020
Cited by 28* · 2020
Evaluating representations by the complexity of learning low-loss predictors
WF Whitney, MJ Song, D Brandfonbrener, J Altosaar, K Cho
arXiv preprint arXiv:2009.07368, 2020
Cited by 27 · 2020
Inverse dynamics pretraining learns good representations for multitask imitation
D Brandfonbrener, O Nachum, J Bruna
Advances in Neural Information Processing Systems 36, 2023
Cited by 16 · 2023
Offline Contextual Bandits with Overparameterized Models
D Brandfonbrener, WF Whitney, R Ranganath, J Bruna
International Conference on Machine Learning (ICML), 2021
Cited by 16* · 2020
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search
D Brandfonbrener, S Henniger, S Raja, T Prasad, C Loughridge, ...
arXiv preprint arXiv:2402.08147, 2024
Cited by 13* · 2024
SOAP: Improving and stabilizing Shampoo using Adam
N Vyas, D Morwani, R Zhao, I Shapira, D Brandfonbrener, L Janson, ...
arXiv preprint arXiv:2409.11321, 2024
Cited by 11 · 2024
Deconstructing what makes a good optimizer for language models
R Zhao*, D Morwani*, D Brandfonbrener*, N Vyas*, S Kakade
arXiv preprint arXiv:2407.07972, 2024
Cited by 10 · 2024
Visual backtracking teleoperation: A data collection protocol for offline image-based reinforcement learning
D Brandfonbrener, S Tu, A Singh, S Welker, C Boodoo, N Matni, J Varley
2023 IEEE International Conference on Robotics and Automation (ICRA), 11336 …, 2023
Cited by 10 · 2023
Quantile filtered imitation learning
D Brandfonbrener, WF Whitney, R Ranganath, J Bruna
arXiv preprint arXiv:2112.00950, 2021
Cited by 8 · 2021
Incorporating explicit uncertainty estimates into deep offline reinforcement learning
D Brandfonbrener, RT Combes, R Laroche
arXiv preprint arXiv:2206.01085, 2022
Cited by 6 · 2022
Universal length generalization with Turing programs
K Hou, D Brandfonbrener, S Kakade, S Jelassi, E Malach
arXiv preprint arXiv:2407.03310, 2024
Cited by 4 · 2024
CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
D Brandfonbrener, H Zhang, A Kirsch, JR Schwarz, S Kakade
arXiv preprint arXiv:2406.10670, 2024
Cited by 4 · 2024
Mixture of parrots: Experts improve memorization more than reasoning
S Jelassi, C Mohri, D Brandfonbrener, A Gu, N Vyas, N Anand, ...
arXiv preprint arXiv:2410.19034, 2024
Cited by 3 · 2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
K Li, S Jelassi, H Zhang, S Kakade, M Wattenberg, D Brandfonbrener
arXiv preprint arXiv:2402.14688, 2024
Cited by 3 · 2024
Articles 1–20