A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

An empirical survey on long document summarization: Datasets, models, and metrics

HY Koh, J Ju, M Liu, S Pan - ACM Computing Surveys, 2022 - dl.acm.org
Long documents such as academic articles and business reports have been the standard
format to detail out important issues and complicated subjects that require extra attention. An …

G-Eval: NLG evaluation using GPT-4 with better human alignment

Y Liu, D Iter, Y Xu, S Wang, R Xu, C Zhu - arXiv preprint arXiv:2303.16634, 2023 - arxiv.org
The quality of texts generated by natural language generation (NLG) systems is hard to
measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE …

ChatEval: Towards better LLM-based evaluators through multi-agent debate

CM Chan, W Chen, Y Su, J Yu, W Xue, S Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Text evaluation has historically posed significant challenges, often demanding substantial
labor and time cost. With the emergence of large language models (LLMs), researchers …

RAGAs: Automated evaluation of retrieval augmented generation

S Es, J James, LE Anke… - Proceedings of the 18th …, 2024 - aclanthology.org
We introduce RAGAs (Retrieval Augmented Generation Assessment), a framework
for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAGAs is …

Is ChatGPT a good NLG evaluator? A preliminary study

J Wang, Y Liang, F Meng, Z Sun, H Shi, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, the emergence of ChatGPT has attracted wide attention from the computational
linguistics community. Many prior studies have shown that ChatGPT achieves remarkable …

News summarization and evaluation in the era of GPT-3

T Goyal, JJ Li, G Durrett - arXiv preprint arXiv:2209.12356, 2022 - arxiv.org
The recent success of zero- and few-shot prompting with models like GPT-3 has led to a
paradigm shift in NLP research. In this paper, we study its impact on text summarization …

Benchmarking foundation models with language-model-as-an-examiner

Y Bai, J Ying, Y Cao, X Lv, Y He… - Advances in …, 2023 - proceedings.neurips.cc
Numerous benchmarks have been established to assess the performance of foundation
models on open-ended question answering, which serves as a comprehensive test of a …

Towards a unified multi-dimensional evaluator for text generation

M Zhong, Y Liu, D Yin, Y Mao, Y Jiao, P Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural
Language Generation (NLG), i.e., evaluating the generated text from multiple explainable …

BARTScore: Evaluating generated text as text generation

W Yuan, G Neubig, P Liu - Advances in neural information …, 2021 - proceedings.neurips.cc
A wide variety of NLP applications, such as machine translation, summarization, and dialog,
involve text generation. One major challenge for these applications is how to evaluate …