| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Training a helpful and harmless assistant with reinforcement learning from human feedback | Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, ... | arXiv preprint arXiv:2204.05862, 2022 | 1622 | 2022 |
| Beyond the imitation game: Quantifying and extrapolating the capabilities of language models | A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... | arXiv preprint arXiv:2206.04615, 2022 | 1291 | 2022 |
| Constitutional AI: Harmlessness from AI feedback | Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ... | arXiv preprint arXiv:2212.08073, 2022 | 1260 | 2022 |
| The AI Index 2021 annual report | D Zhang, S Mishra, E Brynjolfsson, J Etchemendy, D Ganguli, B Grosz, ... | arXiv preprint arXiv:2103.06312, 2021 | 722 | 2021 |
| Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned | D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ... | arXiv preprint arXiv:2209.07858, 2022 | 465 | 2022 |
| A general language assistant as a laboratory for alignment | A Askell, Y Bai, A Chen, D Drain, D Ganguli, T Henighan, A Jones, ... | arXiv preprint arXiv:2112.00861, 2021 | 386 | 2021 |
| A mathematical framework for transformer circuits | N Elhage, N Nanda, C Olsson, T Henighan, N Joseph, B Mann, A Askell, ... | Transformer Circuits Thread 1 (1), 12, 2021 | 380* | 2021 |
| In-context learning and induction heads | C Olsson, N Elhage, N Nanda, N Joseph, N DasSarma, T Henighan, ... | arXiv preprint arXiv:2209.11895, 2022 | 375 | 2022 |
| Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models | A Tamkin, M Brundage, J Clark, D Ganguli | arXiv preprint arXiv:2102.02503, 2021 | 356 | 2021 |
| Predictability and surprise in large generative models | D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ... | Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022 | 298 | 2022 |
| Efficient sensory encoding and Bayesian inference with heterogeneous neural populations | D Ganguli, EP Simoncelli | Neural Computation 26 (10), 2103-2134, 2014 | 259 | 2014 |
| Druid: A real-time analytical data store | F Yang, E Tschetter, X Léauté, N Ray, G Merlino, D Ganguli | Proceedings of the 2014 ACM SIGMOD International Conference on Management of …, 2014 | 245 | 2014 |
| Discovering language model behaviors with model-written evaluations | E Perez, S Ringer, K Lukošiūtė, K Nguyen, E Chen, S Heiner, C Pettit, ... | arXiv preprint arXiv:2212.09251, 2022 | 242 | 2022 |
| Language models (mostly) know what they know | S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ... | arXiv preprint arXiv:2207.05221, 2022 | 178 | 2022 |
| Towards measuring the representation of subjective global opinions in language models | E Durmus, K Nguyen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ... | arXiv preprint arXiv:2306.16388, 2023 | 163 | 2023 |
| The capacity for moral self-correction in large language models | D Ganguli, A Askell, N Schiefer, TI Liao, K Lukošiūtė, A Chen, A Goldie, ... | arXiv preprint arXiv:2302.07459, 2023 | 142 | 2023 |
| Implicit encoding of prior probabilities in optimal neural populations | D Ganguli, E Simoncelli | Advances in Neural Information Processing Systems 23, 2010 | 115 | 2010 |
| Many-shot jailbreaking | C Anil, E Durmus, N Rimsky, M Sharma, J Benton, S Kundu, J Batson, ... | The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024 | 79 | 2024 |
| Sleeper agents: Training deceptive LLMs that persist through safety training | E Hubinger, C Denison, J Mu, M Lambert, M Tong, M MacDiarmid, ... | arXiv preprint arXiv:2401.05566, 2024 | 68 | 2024 |
| Evaluating and mitigating discrimination in language model decisions | A Tamkin, A Askell, L Lovitt, E Durmus, N Joseph, S Kravec, K Nguyen, ... | arXiv preprint arXiv:2312.03689, 2023 | 49 | 2023 |