Hate speech classifiers learn normative social stereotypes

AM Davani, M Atari, B Kennedy… - Transactions of the …, 2023 - direct.mit.edu
Social stereotypes negatively impact individuals' judgments about different groups and may
have a critical role in understanding language directed toward marginalized groups. Here …

Robustness of models addressing Information Disorder: A comprehensive review and benchmarking study

G Fenza, V Loia, C Stanzione, M Di Gisi - Neurocomputing, 2024 - Elsevier
Machine learning and deep learning models are increasingly susceptible to
adversarial attacks, particularly in critical areas like cybersecurity and Information Disorder …

The state of profanity obfuscation in natural language processing scientific publications

D Nozza, D Hovy - Proceedings of the Annual Meeting of the …, 2023 - iris.unibocconi.it
Work on hate speech has made considering rude and harmful examples in scientific
publications inevitable. This situation raises various problems, such as whether or not to …

A systematic review of toxicity in large language models: Definitions, datasets, detectors, detoxification methods and challenges

G Villate-Castillo, J Del Ser, BS Urquijo - 2024 - researchsquare.com
The emergence of the transformer architecture has ushered in a new era of possibilities,
showcasing remarkable capabilities in generative tasks exemplified by models like GPT4o …

Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?

F Vargas, I Carvalho, A Hürriyetoğlu… - Proceedings of the …, 2023 - aclanthology.org
Recent studies have shown that hate speech technologies may propagate social
stereotypes against marginalized groups. Nevertheless, there has been a lack of realistic …

Are shortest rationales the best explanations for human understanding?

H Shen, T Wu, W Guo, THK Huang - ar**s
M Ge, R Mao, E Cambria - Cognitive Computation, 2025 - Springer
With the prosperity of social media, toxic language spreading over social media has become
an unignorable challenge for individual mental health and social harmony. Many …

Studying the influence of toxicity and emotion features for stress detection on social media

Z Alghamdi, T Kumarage, M Karami… - ECSM 2023 10th …, 2023 - books.google.com
It is crucial to detect and manage stress as early as possible before it becomes a severe
mental and physical health problem. Some authors even introduce stress as a “silent killer” …

VertAttack: Taking advantage of Text Classifiers' horizontal vision

J Rusert - arXiv preprint arXiv:2404.08538, 2024 - arxiv.org
Text classification systems have continuously improved in performance over the years.
However, nearly all current SOTA classifiers have a similar shortcoming: they process text in …