GPT-4 technical report J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 9678* | 2023 |
A holistic approach to undesired content detection in the real world T Markov, C Zhang, S Agarwal, FE Nekoul, T Lee, S Adler, A Jiang, ... Proceedings of the AAAI Conference on Artificial Intelligence 37 (12), 15009 …, 2023 | 199 | 2023 |
GPT-4 technical report, 2024 JA OpenAI, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, ... URL https://arxiv.org/abs/2303.08774 2, 6, 2024 | 84 | 2024 |
What are you optimizing for? aligning recommender systems with human values J Stray, I Vendrov, J Nixon, S Adler, D Hadfield-Menell arXiv preprint arXiv:2107.10939, 2021 | 83 | 2021 |
Practices for governing agentic AI systems Y Shavit, S Agarwal, M Brundage, S Adler, C O’Keefe, R Campbell, T Lee, ... Research Paper, OpenAI, December, 2023 | 57 | 2023 |
Lessons learned on language model safety and misuse M Brundage, K Mayer, T Eloundou, S Agarwal, S Adler, G Krueger, ... OpenAI, 2022 | 23 | 2022 |
New and improved content moderation tooling T Markov, C Zhang, S Agarwal, T Eloundou, T Lee, S Adler, A Jiang, ... OpenAI. https://openai.com/blog/new-and-improved-content-moderation-tooling …, 2022 | 19 | 2022 |
When is it appropriate to publish high-stakes AI research C Leibowicz, S Adler, P Eckersley Partnership on AI blog post, 2019 | 12 | 2019 |
Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online S Adler, Z Hitzig, S Jain, C Brewer, W Chang, R DiResta, E Lazzarin, ... arXiv preprint arXiv:2408.07892, 2024 | 5 | 2024 |
Large language models as misleading assistants in conversation BL Hou, K Shi, J Phang, J Aung, S Adler, R Campbell arXiv preprint arXiv:2407.11789, 2024 | 2 | 2024 |
Systems and methods for language model-based content classification T Markov, C Zhang, S Agarwal, FME Nekoul, T Lee, S Adler, A Jiang, ... US Patent App. 18/308,586, 2024 | | 2024 |