When not to trust language models: Investigating effectiveness of parametric and non-parametric memories A Mallen, A Asai, V Zhong, R Das, D Khashabi, H Hajishirzi arXiv preprint arXiv:2212.10511, 2022 | 396 | 2022 |
Representation engineering: A top-down approach to AI transparency A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ... arXiv preprint arXiv:2310.01405, 2023 | 283 | 2023 |
When not to trust language models: Investigating effectiveness and limitations of parametric and non-parametric memories A Mallen, A Asai, V Zhong, R Das, H Hajishirzi, D Khashabi arXiv preprint arXiv:2212.10511, 2022 | 89 | 2022 |
Representation engineering: A top-down approach to AI transparency. CoRR, abs/2310.01405, 2023. doi: 10.48550 A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ... arXiv preprint ARXIV.2310.01405, 0 | 20 | |
Representation engineering: A top-down approach to AI transparency, 2023 A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ... URL https://arxiv.org/abs/2310.01405, 0 | 12 | |
Eliciting latent knowledge from quirky language models A Mallen, M Brumley, J Kharchenko, N Belrose arXiv preprint arXiv:2312.01037, 2023 | 9 | 2023 |
Automatically interpreting millions of features in large language models G Paulo, A Mallen, C Juang, N Belrose arXiv preprint arXiv:2410.13928, 2024 | 7 | 2024 |
Neural Networks Learn Statistics of Increasing Complexity N Belrose, Q Pope, L Quirke, A Mallen, X Fern arXiv preprint arXiv:2402.04362, 2024 | 6 | 2024 |
Deep probabilistic Koopman: long-term time-series forecasting under periodic uncertainties AT Mallen, H Lange, JN Kutz International Journal of Forecasting 40 (3), 859-868, 2024 | 5 | 2024 |
Balancing Label Quantity and Quality for Scalable Elicitation A Mallen, N Belrose arXiv preprint arXiv:2410.13215, 2024 | 3 | 2024 |
The Synaptic Architecture of Layer 5 Thick Tufted Excitatory Neurons in the Visual Cortex of Mice AL Bodor, CM Schneider-Mizell, C Zhang, L Elabbady, A Mallen, ... bioRxiv, 2023.10.18.562531, 2023 | 2 | 2023 |
Koopman-inspired approach for identification of exogenous anomalies in nonstationary time-series data A Mallen, CA Keller, JN Kutz Machine Learning: Science and Technology 4 (2), 025033, 2023 | 2 | 2023 |
Subversion Strategy Eval: Evaluating AI's stateless strategic capabilities against control protocols A Mallen, C Griffin, A Abate, B Shlegeris arXiv preprint arXiv:2412.12480, 2024 | | 2024 |
The Synaptic Architecture of Layer 5 Thick Tufted Excitatory Neurons in the Visual Cortex of Mice NM da Costa, A Bodor, C Schneider-Mizell, C Zhang, L Elabbady, ... | | 2024 |
Robust Unsupervised Mining of Dense Sub-Graphs at Multiple Resolutions N Gupta, G Gupta, J Ghosh, S Shankar, A Mallen | | 2020 |
Enhancing Neural Network Transparency through Representation Analysis A Zou, L Phan, SL Chen, J Campbell, PH Guo, R Ren, A Pan, X Yin, ... | | |