Alex Mallen
Redwood Research
Verified email at rdwrs.com
Title / Cited by / Year
When not to trust language models: Investigating effectiveness of parametric and non-parametric memories
A Mallen, A Asai, V Zhong, R Das, D Khashabi, H Hajishirzi
arXiv preprint arXiv:2212.10511, 2022
Cited by 396, 2022
Representation engineering: A top-down approach to AI transparency
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
arXiv preprint arXiv:2310.01405, 2023
Cited by 283, 2023
When not to trust language models: Investigating effectiveness and limitations of parametric and non-parametric memories
A Mallen, A Asai, V Zhong, R Das, H Hajishirzi, D Khashabi
arXiv preprint arXiv:2212.10511 7, 2022
Cited by 89, 2022
Representation engineering: A top-down approach to AI transparency. CoRR, abs/2310.01405, 2023. doi: 10.48550
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
arXiv preprint ARXIV.2310.01405, 0
Cited by 20
Representation engineering: A top-down approach to AI transparency, 2023
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
URL https://arxiv.org/abs/2310.01405, 0
Cited by 12
Eliciting latent knowledge from quirky language models
A Mallen, M Brumley, J Kharchenko, N Belrose
arXiv preprint arXiv:2312.01037, 2023
Cited by 9, 2023
Automatically interpreting millions of features in large language models
G Paulo, A Mallen, C Juang, N Belrose
arXiv preprint arXiv:2410.13928, 2024
Cited by 7, 2024
Neural Networks Learn Statistics of Increasing Complexity
N Belrose, Q Pope, L Quirke, A Mallen, X Fern
arXiv preprint arXiv:2402.04362, 2024
Cited by 6, 2024
Deep probabilistic Koopman: long-term time-series forecasting under periodic uncertainties
AT Mallen, H Lange, JN Kutz
International Journal of Forecasting 40 (3), 859-868, 2024
Cited by 5, 2024
Balancing Label Quantity and Quality for Scalable Elicitation
A Mallen, N Belrose
arXiv preprint arXiv:2410.13215, 2024
Cited by 3, 2024
The Synaptic Architecture of Layer 5 Thick Tufted Excitatory Neurons in the Visual Cortex of Mice
AL Bodor, CM Schneider-Mizell, C Zhang, L Elabbady, A Mallen, ...
bioRxiv, 2023.10.18.562531, 2023
Cited by 2, 2023
Koopman-inspired approach for identification of exogenous anomalies in nonstationary time-series data
A Mallen, CA Keller, JN Kutz
Machine Learning: Science and Technology 4 (2), 025033, 2023
Cited by 2, 2023
Subversion Strategy Eval: Evaluating AI's stateless strategic capabilities against control protocols
A Mallen, C Griffin, A Abate, B Shlegeris
arXiv preprint arXiv:2412.12480, 2024
2024
The Synaptic Architecture of Layer 5 Thick Tufted Excitatory Neurons in the Visual Cortex of Mice
NM da Costa, A Bodor, C Schneider-Mizell, C Zhang, L Elabbady, ...
2024
Robust Unsupervised Mining of Dense Sub-Graphs at Multiple Resolutions
N Gupta, G Gupta, J Ghosh, S Shankar, A Mallen
2020
Enhancing Neural Network Transparency through Representation Analysis
A Zou, L Phan, SL Chen, J Campbell, PH Guo, R Ren, A Pan, X Yin, ...