Rylan Schaeffer
Title
Cited by
Year
Are emergent abilities of Large Language Models a mirage?
R Schaeffer, B Miranda, S Koyejo
Advances in Neural Information Processing Systems, 2023
Cited by: 472. Year: 2023
DecodingTrust: A comprehensive assessment of trustworthiness in GPT models
B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu, Z Xiong, R Dutta, ...
Advances in Neural Information Processing Systems (Datasets & Benchmarks Track), 2023
Cited by: 393. Year: 2023
Many-shot jailbreaking
C Anil, E Durmus, N Rimsky, M Sharma, J Benton, S Kundu, J Batson, ...
The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
Cited by: 85. Year: 2024
No free lunch from deep learning in neuroscience: A case study through models of the entorhinal-hippocampal circuit
R Schaeffer, M Khona, I Fiete
Advances in Neural Information Processing Systems, 2022
Cited by: 68. Year: 2022
Investigating data contamination for pre-training language models
M Jiang, KZ Liu, M Zhong, R Schaeffer, S Ouyang, J Han, S Koyejo
arXiv preprint arXiv:2401.06059, 2024
Cited by: 47*. Year: 2024
Reverse-engineering recurrent neural network solutions to a hierarchical inference task for mice
R Schaeffer, M Khona, L Meshulam, IR Fiete
Advances in Neural Information Processing Systems, 2020
Cited by: 39. Year: 2020
Is model collapse inevitable? Breaking the curse of recursion by accumulating real and synthetic data
M Gerstgrasser*, R Schaeffer*, A Dey*, R Rafailov*, H Sleight, J Hughes, ...
arXiv preprint arXiv:2404.01413, 2024
Cited by: 35. Year: 2024
Double descent demystified: Identifying, interpreting & ablating the sources of a deep learning puzzle
R Schaeffer, M Khona, Z Robertson, A Boopathy, K Pistunova, JW Rocks, ...
arXiv preprint arXiv:2303.14151, 2023
Cited by: 28. Year: 2023
A brain-wide map of neural activity during complex behaviour
International Brain Laboratory, B Benson, J Benson, D Birman, ...
bioRxiv, 2023.07.04.547681, 2023
Cited by: 26. Year: 2023
Open problems in technical AI governance
A Reuel, B Bucknall, S Casper, T Fist, L Soder, O Aarne, L Hammond, ...
arXiv preprint arXiv:2407.14981, 2024
Cited by: 24. Year: 2024
Brain-wide representations of prior information in mouse decision-making
C Findling, F Hubert, International Brain Laboratory, L Acerbi, B Benson, ...
bioRxiv, 2023.07.04.547684, 2023
Cited by: 24. Year: 2023
Pretraining on the test set is all you need
R Schaeffer
arXiv preprint arXiv:2309.08632, 2023
Cited by: 18. Year: 2023
Self-Supervised Learning of Representations for Space Generates Multi-Modular Grid Cells
R Schaeffer, M Khona, T Ma, C Eyzaguirre, S Koyejo, IR Fiete
Advances in Neural Information Processing Systems (NeurIPS), 2023
Cited by: 17. Year: 2023
Deceptive alignment monitoring
A Carranza, D Pai, R Schaeffer, A Tandon, S Koyejo
ICML 2023 Workshop: Adversarial Machine Learning Frontiers, 2023
Cited by: 13. Year: 2023
Emergence of sparse representations from noise
T Bricken, R Schaeffer, B Olshausen, G Kreiman
Cited by: 10. Year: 2023
Quantifying Variance in Evaluation Benchmarks
L Madaan, AK Singh, R Schaeffer, A Poulton, S Koyejo, P Stenetorp, ...
arXiv preprint arXiv:2406.10229, 2024
Cited by: 9. Year: 2024
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
R Schaeffer, H Schoelkopf, B Miranda, G Mukobi, V Madan, A Ibrahim, ...
arXiv preprint arXiv:2406.04391, 2024
Cited by: 9. Year: 2024
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
R Schaeffer, D Valentine, L Bailey, J Chua, C Eyzaguirre, Z Durante, ...
arXiv preprint arXiv:2407.15211, 2024
Cited by: 6*. Year: 2024
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models
S Duan, M Khona, A Iyer, R Schaeffer, IR Fiete
arXiv preprint arXiv:2406.14549, 2024
Cited by: 5. Year: 2024
What Causes Polysemanticity? An Alternative Origin Story of Mixed Selectivity from Incidental Causes
V Lecomte, K Thaman, R Schaeffer, N Bashkansky, T Chow, S Koyejo
arXiv preprint arXiv:2312.03096, 2024
Cited by: 5*. Year: 2024