Hate speech classifiers learn normative social stereotypes
Social stereotypes negatively impact individuals' judgments about different groups and may
have a critical role in understanding language directed toward marginalized groups. Here …
Annotators with attitudes: How annotator beliefs and identities bias toxic language detection
The perceived toxicity of language can vary based on someone's identity and beliefs, but
this variation is often ignored when collecting toxic language datasets, resulting in dataset …
Quality aspects of annotated data: A research synthesis
J Beck - AStA Wirtschafts- und Sozialstatistisches Archiv, 2023 - Springer
The quality of Machine Learning (ML) applications is commonly assessed by
quantifying how well an algorithm fits its respective training data. Yet, a perfect model that …
Detectors for safe and reliable llms: Implementations, uses, and limitations
S Achintalwar, AA Garcia, A Anaby-Tavor… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output
to biased and toxic generations. Due to several limiting factors surrounding LLMs (training …
A systematic review of toxicity in large language models: Definitions, datasets, detectors, detoxification methods and challenges
The emergence of the transformer architecture has ushered in a new era of possibilities,
showcasing remarkable capabilities in generative tasks exemplified by models like GPT4o …
Annotation sensitivity: Training data collection methods affect model performance
When training data are collected from human annotators, the design of the annotation
instrument, the instructions given to annotators, the characteristics of the annotators, and …
GRASP: a disagreement analysis framework to assess group associations in perspectives
Human annotation plays a core role in machine learning--annotations for supervised
models, safety guardrails for generative models, and human feedback for reinforcement …
Critical perspectives: A benchmark revealing pitfalls in PerspectiveAPI
Detecting “toxic” language in internet content is a pressing social and technical challenge. In
this work, we focus on Perspective API from Jigsaw, a state-of-the-art tool that promises to …
SoUnD Framework: Analyzing (So)cial Representation in (Un)structured (D)ata
Decisions about how to responsibly collect, use and document data often rely upon
understanding how people are represented in data. Yet, the unlabeled nature and scale of …
The risks of machine learning systems
The speed and scale at which machine learning (ML) systems are deployed are
accelerating even as an increasing number of studies highlight their potential for negative …