Understanding practices, challenges, and opportunities for user-engaged algorithm auditing in industry practice
Recent years have seen growing interest among both researchers and practitioners in user-
engaged approaches to algorithm auditing, which directly engage users in detecting …
AI transparency in the age of LLMs: A human-centered research roadmap
The rise of powerful large language models (LLMs) brings about tremendous opportunities
for innovation but also looming risks for individuals and society at large. We have reached a …
Increasing diversity while maintaining accuracy: Text data generation with large language models and human interventions
Large language models (LLMs) can be used to generate text data for training and evaluating
other models. However, creating high-quality datasets with LLMs can be challenging. In this …
Hierarchical text classification and its foundations: A review of current research
While collections of documents are often annotated with hierarchically structured concepts,
the benefits of these structures are rarely taken into account by classification techniques …
Toward trustworthy AI development: mechanisms for supporting verifiable claims
With the recent wave of progress in artificial intelligence (AI) has come a growing awareness
of the large-scale impacts of AI systems, and recognition that existing regulations and norms …
Supporting human-AI collaboration in auditing LLMs with LLMs
Large language models (LLMs) are increasingly becoming all-powerful and pervasive via
deployment in sociotechnical systems. Yet these language models, be it for classification or …
Evaluating models' local decision boundaries via contrast sets
Standard test sets for supervised learning evaluate in-distribution generalization.
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these …
EvalLM: Interactive evaluation of large language model prompts on user-defined criteria
By simply composing prompts, developers can prototype novel generative applications with
Large Language Models (LLMs). To refine prototypes into products, however, developers …
HateCheck: Functional tests for hate speech detection models
Detecting online hate is a difficult task that even state-of-the-art models struggle with.
Typically, hate speech detection models are evaluated by measuring their performance on …
Polyjuice: Generating counterfactuals for explaining, evaluating, and improving models
While counterfactual examples are useful for analysis and training of NLP models, current
generation methods either rely on manual labor to create very few counterfactuals, or only …