| Title | Authors | Venue | Cited by | Year |
|---|---|---|---|---|
| Language Models Linearly Represent Sentiment | C Tigges, O Hollinsworth, A Geiger, N Nanda | Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting …, 2024 | 45* | 2024 |
| Transformer-based models are not yet perfect at learning to emulate structural recursion | D Zhang, C Tigges, Z Zhang, S Biderman, M Raginsky, T Ringer | arXiv preprint arXiv:2401.12947, 2024 | 20* | 2024 |
| GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch, 9 2023 | A Andonian, Q Anthony, S Biderman, S Black, P Gali, L Gao, E Hallahan, ... | URL https://www.github.com/eleutherai/gpt-neox | 7 | 2023 |
| LLM circuit analyses are consistent across training and scale | C Tigges, M Hanna, Q Yu, S Biderman | arXiv preprint arXiv:2407.10827, 2024 | 4 | 2024 |
| Stitching Sparse Autoencoders of Different Sizes | P Leask, B Bussmann, JI Bloom, C Tigges, N Al Moubayed, N Nanda | NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning | 1 | 2024 |