A survey of GPT-3 family large language models including ChatGPT and GPT-4
KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …
An empirical survey on long document summarization: Datasets, models, and metrics
Long documents such as academic articles and business reports have been the standard
format to detail out important issues and complicated subjects that require extra attention. An …
G-Eval: NLG evaluation using GPT-4 with better human alignment
The quality of texts generated by natural language generation (NLG) systems is hard to
measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE …
ChatEval: Towards better LLM-based evaluators through multi-agent debate
Text evaluation has historically posed significant challenges, often demanding substantial
labor and time cost. With the emergence of large language models (LLMs), researchers …
RAGAs: Automated evaluation of retrieval augmented generation
We introduce RAGAs (Retrieval Augmented Generation Assessment), a framework
for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAGAs is …
Is ChatGPT a good NLG evaluator? A preliminary study
Recently, the emergence of ChatGPT has attracted wide attention from the computational
linguistics community. Many prior studies have shown that ChatGPT achieves remarkable …
News summarization and evaluation in the era of GPT-3
The recent success of zero- and few-shot prompting with models like GPT-3 has led to a
paradigm shift in NLP research. In this paper, we study its impact on text summarization …
Benchmarking foundation models with language-model-as-an-examiner
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …
Towards a unified multi-dimensional evaluator for text generation
Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural
Language Generation (NLG), i.e., evaluating the generated text from multiple explainable …
BARTScore: Evaluating generated text as text generation
A wide variety of NLP applications, such as machine translation, summarization, and dialog,
involve text generation. One major challenge for these applications is how to evaluate …