From generation to judgment: Opportunities and challenges of LLM-as-a-judge
Assessment and evaluation have long been critical challenges in artificial intelligence (AI)
and natural language processing (NLP). However, traditional methods, whether matching …
Large language models for data annotation and synthesis: A survey
Data annotation and synthesis generally refers to the labeling or generating of raw data with
relevant information, which could be used for improving the efficacy of machine learning …
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning
Recent works have shown the benefits to LLMs from fine-tuning golden-standard Chain-of-
Thought (CoT) rationales or using them as correct examples in few-shot prompting. While …
Weak-to-strong reasoning
When large language models (LLMs) exceed human-level capabilities, it becomes
increasingly challenging to provide full-scale and accurate supervision for these models …
Language Model Preference Evaluation with Multiple Weak Evaluators
Despite the remarkable success of Large Language Models (LLMs), evaluating their outputs'
quality regarding preference remains a critical challenge. Existing works usually leverage a …