Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1146 | 2021 |
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... International conference on machine learning, 2206-2240, 2022 | 1108 | 2022 |
Red teaming language models with language models E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, ... arXiv preprint arXiv:2202.03286, 2022 | 608 | 2022 |
Generative AI and the digital commons S Huang, D Siddarth arXiv preprint arXiv:2303.11074, 2023 | 82 | 2023 |
Using the Veil of Ignorance to align AI systems with principles of justice L Weidinger, KR McKee, R Everett, S Huang, TO Zhu, MJ Chadwick, ... Proceedings of the National Academy of Sciences 120 (18), e2213709120, 2023 | 34 | 2023 |
Collective constitutional AI: Aligning a language model with public input S Huang, D Siddarth, L Lovitt, TI Liao, E Durmus, A Tamkin, D Ganguli Proceedings of the 2024 ACM Conference on Fairness, Accountability, and …, 2024 | 25 | 2024 |
How large language models can reshape collective intelligence JW Burton, E Lopez-Lopez, S Hechtlinger, Z Rahwan, S Aeschbach, ... Nature Human Behaviour 8 (9), 1643-1655, 2024 | 16 | 2024 |
Beyond static AI evaluations: advancing human interaction evaluations for LLM harms and risks L Ibrahim, S Huang, L Ahmad, M Anderljung arXiv preprint arXiv:2405.10632, 2024 | 12 | 2024 |
Collective constitutional AI: Aligning a language model with public input D Ganguli, S Huang, L Lovitt, D Siddarth, E Durmus, T Liao, A Askell, ... Accessed on February 10, 2024, 2023 | 10 | 2023 |
Evaluating feature steering: A case study in mitigating social biases, 2024 E Durmus, A Tamkin, J Clark, J Wei, J Marcus, J Batson, K Handa, L Lovitt, ... URL https://anthropic.com/research/evaluating-feature-steering, 0 | 6 | |
A Departure from Truth S Huang Harvard Political Review, 2016 | 4 | 2016 |
How will advanced AI systems impact democracy? C Summerfield, L Argyle, M Bakker, T Collins, E Durmus, T Eloundou, ... arXiv preprint arXiv:2409.06729, 2024 | 2 | 2024 |
Clio: Privacy-Preserving Insights into Real-World AI Use A Tamkin, M McCain, K Handa, E Durmus, L Lovitt, A Rathi, S Huang, ... arXiv preprint arXiv:2412.13678, 2024 | | 2024 |
Control and Consciousness of Time S Huang | | 2023 |
Bi-Level Multi-Agent Reinforcement Learning for Intervening in Intertemporal Social Dilemmas S Huang Harvard University, 2021 | | 2021 |
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations K Handa, A Tamkin, M McCain, S Huang, E Durmus, S Heck, J Mueller, ... | | |