Foundational challenges in assuring alignment and safety of large language models | U Anwar, A Saparov, J Rando, D Paleka, M Turpin, P Hase, ES Lubana, ... | Transactions on Machine Learning Research (TMLR), 2024 | 136 | 2024 |
Language Model Tokenizers Introduce Unfairness Between Languages | A Petrov, E La Malfa, PHS Torr, A Bibi | Conference on Neural Information Processing Systems (NeurIPS) 2023, 2023 | 101 | 2023 |
When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations | A Petrov, PHS Torr, A Bibi | International Conference on Learning Representations (ICLR) 2024, 2023 | 31 | 2023 |
Near to Mid-term Risks and Opportunities of Open Source Generative AI | F Eiras, A Petrov, B Vidgen, CS de Witt, F Pizzati, K Elkins, ... | International Conference on Machine Learning (ICML) 2024, 2024 | 24 | 2024 |
Language-Models-as-a-Service: Overview of a New Paradigm and its Challenges | E La Malfa, A Petrov, S Frieder, C Weinhuber, R Burnell, R Nazar, A Cohn, ... | Journal of Artificial Intelligence Research 80, 1497-1523, 2024 | 20* | 2024 |
Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents | J Tani, AF Daniele, G Bernasconi, A Camus, A Petrov, A Courchesne, ... | 2020 IEEE International Conference on Intelligent Robots and Systems (IROS), 2020 | 17 | 2020 |
Learning Camera Miscalibration Detection | A Cramariuc, A Petrov, R Suri, M Mittal, R Siegwart, C Cadena | 2020 IEEE International Conference on Robotics and Automation (ICRA), 4997-5003, 2020 | 16 | 2020 |
Prompting a Pretrained Transformer Can Be a Universal Approximator | A Petrov, PHS Torr, A Bibi | International Conference on Machine Learning (ICML) 2024, 2024 | 11 | 2024 |
Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models | F Eiras, A Petrov, P Torr, MP Kumar, A Bibi | International Conference on Learning Representations (ICLR) 2025, 2024 | 5* | 2024 |
Certifying Ensembles: A General Certification Theory with S-Lipschitzness | A Petrov, F Eiras, A Sanyal, PHS Torr, A Bibi | International Conference on Machine Learning (ICML) 2023, 2023 | 2 | 2023 |
HiddenGems: Efficient safety boundary detection with active learning | A Petrov, C Fang, KM Pham, YH Eng, JGM Fu, SD Pendleton | 2022 IEEE International Conference on Intelligent Robots and Systems (IROS …, 2022 | 2 | 2022 |
Universal In-Context Approximation By Prompting Fully Recurrent Models | A Petrov, TA Lamb, A Paren, PHS Torr, A Bibi | Conference on Neural Information Processing Systems (NeurIPS) 2024, 2024 | 1 | 2024 |
Search Algorithms and Safety Verification for Compliant Domain Volumes | SD Pendleton, AP Petrov, CTY Fang, MK Pham, JGM Fu, YH Eng | US Patent App. 17/941,712, 2023 | 1 | 2023 |
Robustness of Unsupervised Representation Learning without Labels | A Petrov, M Kwiatkowska | arXiv preprint arXiv:2210.04076, 2022 | 1 | 2022 |
Optimizing Multi-rendezvous Spacecraft Trajectories: Matrices and Sequence Selection | A Petrov, R Noomen | arXiv preprint arXiv:2011.06617, 2020 | 1 | 2020 |
On the Coexistence and Ensembling of Watermarks | A Petrov, S Agarwal, PHS Torr, A Bibi, J Collomosse | arXiv preprint arXiv:2501.17356, 2025 | | 2025 |
Risks and Opportunities of Open-Source Generative AI | F Eiras, A Petrov, B Vidgen, C Schroeder, F Pizzati, K Elkins, ... | arXiv preprint arXiv:2405.08597, 2024 | | 2024 |
Compositional Computational Systems | A Petrov | ETH Zurich, 2020 | | 2020 |