Alexander Meinke
Apollo Research
Verified email at apolloresearch.ai
Title
Cited by
Year
Towards neural networks that provably know when they don't know
A Meinke, M Hein
arXiv preprint arXiv:1909.12180, 2019
Cited by 174, 2019
Adversarial robustness on in- and out-distribution improves explainability
M Augustin, A Meinke, M Hein
European Conference on Computer Vision, 228-245, 2020
Cited by 106, 2020
Certifiably adversarially robust detection of out-of-distribution data
J Bitterwolf, A Meinke, M Hein
Advances in Neural Information Processing Systems 33, 16085-16095, 2020
Cited by 88, 2020
Breaking down out-of-distribution detection: Many methods based on OOD training data estimate a combination of the same core quantities
J Bitterwolf, A Meinke, M Augustin, M Hein
International Conference on Machine Learning, 2041-2074, 2022
Cited by 36, 2022
Provably adversarially robust detection of out-of-distribution data (almost) for free
A Meinke, J Bitterwolf, M Hein
Advances in Neural Information Processing Systems 35, 30167-30180, 2022
Cited by 32*, 2022
Me, myself, and AI: The situational awareness dataset (SAD) for LLMs
R Laine, B Chughtai, J Betley, K Hariharan, J Scheurer, M Balesni, ...
arXiv preprint arXiv:2407.04694, 2024
Cited by 19, 2024
Network inference and maximum entropy estimation on information diagrams
EA Martin, J Hlinka, A Meinke, F Děchtěrenko, J Tintěra, I Oliver, ...
Scientific Reports 7 (1), 7062, 2017
Cited by 16, 2017
Towards a situational awareness benchmark for LLMs
R Laine, A Meinke, O Evans
Socially Responsible Language Modelling Research, 2023
Cited by 10, 2023
Towards evaluations-based safety cases for AI scheming
M Balesni, M Hobbhahn, D Lindner, A Meinke, T Korbak, J Clymer, ...
arXiv preprint arXiv:2411.03336, 2024
Cited by 6, 2024
Frontier models are capable of in-context scheming
A Meinke, B Schoen, J Scheurer, M Balesni, R Shah, M Hobbhahn
arXiv preprint arXiv:2412.04984, 2024
Cited by 5, 2024
Tell, don't show: Declarative facts influence how LLMs generalize
A Meinke, O Evans
arXiv preprint arXiv:2312.07779, 2023
Cited by 4, 2023
Classifiers should do well even on their worst classes
J Bitterwolf, A Meinke, V Boreiko, M Hein
ICML 2022 Shift Happens Workshop, 2022
Cited by 4, 2022
Robust out-of-distribution detection in deep classifiers
A Meinke
Universität Tübingen, 2023
2023
Improving Fairness and Cybersecurity in the Artificial Intelligence Act
G Carovano, A Meinke
EWAF, 2023
2023
Applications of the Extremal Functional Bootstrap
A Meinke
Universidade de São Paulo, 2018
2018
Articles 1–15