Bertie Vidgen
Oxford, Turing
Verified email at rewire.online
Title
Cited by
Year
Dynabench: Rethinking benchmarking in NLP
D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger, Z Wu, B Vidgen, G Prasad, ...
arXiv preprint arXiv:2104.14337, 2021
Cited by: 425, Year: 2021
Directions in abusive language training data, a systematic review: Garbage in, garbage out
B Vidgen, L Derczynski
PLOS ONE 15 (12), e0243300, 2020
Cited by: 350, Year: 2020
HateCheck: Functional tests for hate speech detection models
P Röttger, B Vidgen, D Nguyen, Z Waseem, H Margetts, JB Pierrehumbert
arXiv preprint arXiv:2012.15606, 2020
Cited by: 272, Year: 2020
Learning from the worst: Dynamically generated datasets to improve online hate detection
B Vidgen, T Thrush, Z Waseem, D Kiela
arXiv preprint arXiv:2012.15761, 2020
Cited by: 260, Year: 2020
TrustLLM: Trustworthiness in large language models
Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li, C Gao, Y Huang, W Lyu, ...
arXiv preprint arXiv:2401.05561, 2024
Cited by: 246, Year: 2024
Challenges and frontiers in abusive content detection
B Vidgen, A Harris, D Nguyen, R Tromble, S Hale, H Margetts
Proceedings of the third workshop on abusive language online, 2019
Cited by: 232, Year: 2019
Detecting weak and strong Islamophobic hate speech on social media
B Vidgen, T Yasseri
Journal of Information Technology & Politics 17 (1), 66-78, 2020
Cited by: 209, Year: 2020
Two contrasting data annotation paradigms for subjective NLP tasks
P Röttger, B Vidgen, D Hovy, JB Pierrehumbert
arXiv preprint arXiv:2112.07475, 2021
Cited by: 165, Year: 2021
P-Values: Misunderstood and Misused
B Vidgen, T Yasseri
Frontiers in Physics 4, 6, 2016
Cited by: 157, Year: 2016
SemEval-2023 task 10: Explainable detection of online sexism
HR Kirk, W Yin, B Vidgen, P Röttger
arXiv preprint arXiv:2303.04222, 2023
Cited by: 136, Year: 2023
XSTest: A test suite for identifying exaggerated safety behaviours in large language models
P Röttger, HR Kirk, B Vidgen, G Attanasio, F Bianchi, D Hovy
arXiv preprint arXiv:2308.01263, 2023
Cited by: 125, Year: 2023
An expert annotated dataset for the detection of online misogyny
E Guest, B Vidgen, A Mittos, N Sastry, G Tyson, H Margetts
Proceedings of the 16th conference of the European chapter of the …, 2021
Cited by: 120, Year: 2021
Detecting East Asian prejudice on social media
B Vidgen, A Botelho, D Broniatowski, E Guest, M Hall, H Margetts, ...
arXiv preprint arXiv:2005.03909, 2020
Cited by: 116, Year: 2020
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
HR Kirk, B Vidgen, P Röttger, SA Hale
arXiv preprint arXiv:2303.05453, 2023
Cited by: 100, Year: 2023
Introducing CAD: the contextual abuse dataset
B Vidgen, D Nguyen, H Margetts, P Rossini, R Tromble
Cited by: 99, Year: 2021
The benefits, risks and bounds of personalizing the alignment of large language models to individuals
HR Kirk, B Vidgen, P Röttger, SA Hale
Nature Machine Intelligence, 1-10, 2024
Cited by: 76, Year: 2024
Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
HR Kirk, B Vidgen, P Röttger, T Thrush, SA Hale
arXiv preprint arXiv:2108.05921, 2021
Cited by: 63, Year: 2021
Multilingual HateCheck: Functional tests for multilingual hate speech detection models
P Röttger, H Seelawi, D Nozza, Z Talat, B Vidgen
arXiv preprint arXiv:2206.09917, 2022
Cited by: 61, Year: 2022
Understanding RT’s audiences: Exposure not endorsement for Twitter followers of Russian state-sponsored media
R Crilley, M Gillespie, B Vidgen, A Willis
The International Journal of Press/Politics 27 (1), 220-242, 2022
Cited by: 60, Year: 2022
How much online abuse is there
B Vidgen, H Margetts, A Harris
Alan Turing Institute 11, 2019
Cited by: 58, Year: 2019
Articles 1–20