Resources and benchmark corpora for hate speech detection: a systematic review

F Poletto, V Basile, M Sanguinetti, C Bosco… - Language Resources …, 2021 - Springer
Hate Speech in social media is a complex phenomenon, whose detection has recently
gained significant traction in the Natural Language Processing community, as attested by …

A literature review of textual hate speech detection methods and datasets

F Alkomah, X Ma - Information, 2022 - mdpi.com
Online toxic discourses could result in conflicts between groups or harm to online
communities. Hate speech is complex and multifaceted harmful or offensive content …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Evaluating the social impact of generative AI systems in systems and society

I Solaiman, Z Talat, W Agnew, L Ahmad… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative AI systems across modalities, ranging from text, image, audio, and video, have
broad social impacts, but there exists no official standard for means of evaluating those …

On the dangers of stochastic parrots: Can language models be too big?🦜

EM Bender, T Gebru, A McMillan-Major… - Proceedings of the 2021 …, 2021 - dl.acm.org
The past 3 years of work in NLP have been characterized by the development and
deployment of ever larger language models, especially for English. BERT, its variants, GPT …

Dealing with disagreements: Looking beyond the majority vote in subjective annotations

AM Davani, M Díaz, V Prabhakaran - Transactions of the Association …, 2022 - direct.mit.edu
Majority voting and averaging are common approaches used to resolve annotator
disagreements and derive single ground truth labels from multiple annotations. However …

Into the LAION's den: Investigating hate in multimodal datasets

A Birhane, S Han, V Boddeti… - Advances in Neural …, 2024 - proceedings.neurips.cc
'Scale the model, scale the data, scale the compute' is the reigning sentiment in the
world of generative AI today. While the impact of model scaling has been extensively …

Dynabench: Rethinking benchmarking in NLP

D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …

XHate-999: Analyzing and detecting abusive language across domains and languages

G Glavaš, M Karan, I Vulić - 2020 - madoc.bib.uni-mannheim.de
We present XHATE-999, a multi-domain and multilingual evaluation data set for abusive
language detection. By aligning test instances across six typologically diverse languages …