A survey on evaluation of large language models
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …
Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
ChatEval: Towards better LLM-based evaluators through multi-agent debate
Text evaluation has historically posed significant challenges, often demanding substantial
labor and time cost. With the emergence of large language models (LLMs), researchers …
Large language models are not fair evaluators
In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large
language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of …
Unleashing the potential of prompt engineering in large language models: a comprehensive review
This comprehensive review delves into the pivotal role of prompt engineering in unleashing
the capabilities of Large Language Models (LLMs). The development of Artificial Intelligence …
Aligning large language models with humans: A survey
Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …
Fine-tuning ChatGPT for automatic scoring
This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring
student written constructed responses using example assessment tasks in science …
Large language model alignment: A survey
Recent years have witnessed remarkable progress made in large language models (LLMs).
Such advancements, while garnering significant attention, have concurrently elicited various …
Evaluating large language models at evaluating instruction following
As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …
Prometheus: Inducing fine-grained evaluation capability in language models
Recently, GPT-4 has become the de facto evaluator for long-form text generated by large
language models (LLMs). However, for practitioners and researchers with large and custom …