Constitutional AI: Harmlessness from AI Feedback | Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ... | arXiv preprint arXiv:2212.08073, 2022 | 1260 | 2022 |
Discovering Language Model Behaviors with Model-Written Evaluations | E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ... | arXiv preprint arXiv:2212.09251, 2022 | 242 | 2022 |
The capacity for moral self-correction in large language models | D Ganguli, A Askell, N Schiefer, TI Liao, K Lukošiūtė, A Chen, A Goldie, ... | arXiv preprint arXiv:2302.07459, 2023 | 142 | 2023 |
Studying Large Language Model Generalization with Influence Functions | R Grosse, J Bae, C Anil, N Elhage, A Tamkin, A Tajdini, B Steiner, D Li, ... | arXiv preprint arXiv:2308.03296, 2023 | 128 | 2023 |
Measuring progress on scalable oversight for large language models | SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ... | arXiv preprint arXiv:2211.03540, 2022 | 105 | 2022 |
Measuring Faithfulness in Chain-of-Thought Reasoning | T Lanham, A Chen, A Radhakrishnan, B Steiner, C Denison, ... | arXiv preprint arXiv:2307.13702, 2023 | 103 | 2023 |
The Challenges Ahead for Multimessenger Analyses of Gravitational Waves and Kilonova: A Case Study on GW190425 | G Raaijmakers, S Nissanke, F Foucart, MM Kasliwal, M Bulla, ... | The Astrophysical Journal 922 (2), 269, 2021 | 66 | 2021 |
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning | A Radhakrishnan, K Nguyen, A Chen, C Chen, C Denison, D Hernandez, ... | arXiv preprint arXiv:2307.11768, 2023 | 48 | 2023 |
Prospects of Gravitational-wave Follow-up through a Wide-field Ultraviolet Satellite: A Dorado Case Study | B Dorsman, G Raaijmakers, SB Cenko, S Nissanke, LP Singer, ... | The Astrophysical Journal 944 (2), 126, 2023 | 11 | 2023 |
KilonovaNet: Surrogate models of kilonova spectra with conditional variational autoencoders | K Lukošiūtė, G Raaijmakers, Z Doctor, M Soares-Santos, B Nord | Monthly Notices of the Royal Astronomical Society 516 (1), 1137-1148, 2022 | 11 | 2022 |
Error Analysis of Kilonova Surrogate Models | K Lukošiūtė, G Raaijmakers, Z Doctor, M Soares-Santos | | | |