Saffron Huang

Citations

	All	Since 2020
Citations	3053	3049
h-index	9	9
i10-index	9	9

Citations per year
Year	2021	2022	2023	2024	2025
Citations	12	363	930	1541	193

Public access

1 article available
0 articles not available

Based on funding mandates


Anthropic

Verified email at anthropic.com - Homepage


Title	Cited by	Year
Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021	1146	2021
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... International conference on machine learning, 2206-2240, 2022	1108	2022
Red teaming language models with language models E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, ... arXiv preprint arXiv:2202.03286, 2022	608	2022
Generative AI and the digital commons S Huang, D Siddarth arXiv preprint arXiv:2303.11074, 2023	82	2023
Using the Veil of Ignorance to align AI systems with principles of justice L Weidinger, KR McKee, R Everett, S Huang, TO Zhu, MJ Chadwick, ... Proceedings of the National Academy of Sciences 120 (18), e2213709120, 2023	34	2023
Collective Constitutional AI: Aligning a language model with public input S Huang, D Siddarth, L Lovitt, TI Liao, E Durmus, A Tamkin, D Ganguli Proceedings of the 2024 ACM Conference on Fairness, Accountability, and …, 2024	25	2024
How large language models can reshape collective intelligence JW Burton, E Lopez-Lopez, S Hechtlinger, Z Rahwan, S Aeschbach, ... Nature Human Behaviour 8 (9), 1643-1655, 2024	16	2024
Beyond static AI evaluations: advancing human interaction evaluations for LLM harms and risks L Ibrahim, S Huang, L Ahmad, M Anderljung arXiv preprint arXiv:2405.10632, 2024	12	2024
Collective Constitutional AI: Aligning a language model with public input D Ganguli, S Huang, L Lovitt, D Siddarth, E Durmus, T Liao, A Askell, ... Accessed on February 10, 2024, 2023	10	2023
Evaluating feature steering: A case study in mitigating social biases, 2024 E Durmus, A Tamkin, J Clark, J Wei, J Marcus, J Batson, K Handa, L Lovitt, ... URL https://anthropic.com/research/evaluating-feature-steering	6
A Departure from Truth S Huang Harvard Political Review, 2016	4	2016
How will advanced AI systems impact democracy? C Summerfield, L Argyle, M Bakker, T Collins, E Durmus, T Eloundou, ... arXiv preprint arXiv:2409.06729, 2024	2	2024
Clio: Privacy-Preserving Insights into Real-World AI Use A Tamkin, M McCain, K Handa, E Durmus, L Lovitt, A Rathi, S Huang, ... arXiv preprint arXiv:2412.13678, 2024		2024
Control and Consciousness of Time S Huang		2023
Bi-Level Multi-Agent Reinforcement Learning for Intervening in Intertemporal Social Dilemmas S Huang Harvard University, 2021		2021
Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations K Handa, A Tamkin, M McCain, S Huang, E Durmus, S Heck, J Mueller, ...
