Reconstructing the mind's eye: fMRI-to-image with contrastive learning and diffusion priors P Scotti, A Banerjee, J Goode, S Shabalin, A Nguyen, A Dempster, ... Advances in Neural Information Processing Systems 36, 2024 | 98 | 2024 |
Competition report: Finding universal jailbreak backdoors in aligned LLMs J Rando, F Croce, K Mitka, S Shabalin, M Andriushchenko, N Flammarion, ... arXiv preprint arXiv:2404.14461, 2024 | 13 | 2024 |
Error correcting HTR’ed Byzantine text J Pavlopoulos, V Kougia, P Platanou, S Shabalin, K Liagkou, ... | 6* | 2023 |
Self-explaining SAE features S Shabalin, D Kharlapenko, N Nanda, A Conmy AI Alignment Forum, 2024 | 4* | 2024 |
Transcoders Beat Sparse Autoencoders for Interpretability G Paulo, S Shabalin, N Belrose arXiv preprint arXiv:2501.18823, 2025 | | 2025 |
Scaling Sparse Feature Circuits For Studying In-Context Learning D Kharlapenko, S Shabalin, F Barez, N Nanda, A Conmy | | |