Olivia Watkins
Verified email at berkeley.edu
Title
Cited by
Year
Aligning text-to-image models using human feedback
K Lee, H Liu, M Ryu, O Watkins, Y Du, C Boutilier, P Abbeel, ...
arXiv preprint arXiv:2302.12192, 2023
Cited by 239, year 2023
Guiding pretraining in reinforcement learning with large language models
Y Du, O Watkins, Z Wang, C Colas, T Darrell, P Abbeel, A Gupta, ...
International Conference on Machine Learning, 8657-8677, 2023
Cited by 206, year 2023
Reinforcement learning for fine-tuning text-to-image diffusion models
Y Fan, O Watkins, Y Du, H Liu, M Ryu, C Boutilier, P Abbeel, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by 188, year 2024
GPT-4o system card
A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ...
arXiv preprint arXiv:2410.21276, 2024
Cited by 152, year 2024
Auto-tuned sim-to-real transfer
Y Du, O Watkins, T Darrell, P Abbeel, D Pathak
2021 IEEE International Conference on Robotics and Automation (ICRA), 1290-1296, 2021
Cited by 88, year 2021
Tensor Trust: Interpretable prompt injection attacks from an online game
S Toyer, O Watkins, EA Mendes, J Svegliato, L Bailey, T Wang, I Ong, ...
arXiv preprint arXiv:2311.01011, 2023
Cited by 68, year 2023
Learning to model the world with language
J Lin, Y Du, O Watkins, D Hafner, P Abbeel, D Klein, A Dragan
arXiv preprint arXiv:2308.01399, 2023
Cited by 48, year 2023
A StrongREJECT for empty jailbreaks
A Souly, Q Lu, D Bowen, T Trinh, E Hsieh, S Pandey, P Abbeel, ...
arXiv preprint arXiv:2402.10260, 2024
Cited by 46, year 2024
OpenAI o1 system card
A Jaech, A Kalai, A Lerer, A Richardson, A El-Kishky, A Low, A Helyar, ...
arXiv preprint arXiv:2412.16720, 2024
Cited by 45, year 2024
Program language translation using a grammar-driven tree-to-tree model
M Drissi, O Watkins, A Khant, V Ojha, P Sandoval, R Segev, E Weiner, ...
arXiv preprint arXiv:1807.01784, 2018
Cited by 24, year 2018
Hierarchical text generation using an outline
M Drissi, O Watkins, J Kalita
arXiv preprint arXiv:1810.08802, 2018
Cited by 11, year 2018
Explaining robot policies
O Watkins, S Huang, J Frost, K Bhatia, E Weiner, P Abbeel, T Darrell, ...
Applied AI Letters 2 (4), e52, 2021
Cited by 10, year 2021
Explaining reinforcement learning policies through counterfactual trajectories
J Frost, O Watkins, E Weiner, P Abbeel, T Darrell, B Plummer, K Saenko
arXiv preprint arXiv:2201.12462, 2022
Cited by 9, year 2022
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game, November 2023
S Toyer, O Watkins, EA Mendes, J Svegliato, L Bailey, T Wang, I Ong, ...
arXiv preprint arXiv:2311.01011
Cited by 9
Teachable reinforcement learning via advice distillation
O Watkins, A Gupta, T Darrell, P Abbeel, J Andreas
Advances in Neural Information Processing Systems 34, 6920-6933, 2021
Cited by 4, year 2021
Towards Agents Which Can Understand Rich Communication
O Watkins
University of California, Berkeley, 2024
2024
How to Evaluate Jailbreak Methods: A Case Study With the StrongREJECT Benchmark. The paper in question claimed an impressive 43% success rate in jailbreaking GPT-4 by …
D Bowen, S Emmons, A Souly, Q Lu, T Trinh, E Hsieh, S Pandey, ...
Articles 1–17