Google Scholar

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Cited by 619

[PDF] arxiv.org

GPTEval: A survey on assessments of ChatGPT and GPT-4

R Mao, G Chen, X Zhang, F Guerin… - arxiv preprint arxiv …, 2023 - arxiv.org

The emergence of ChatGPT has generated much speculation in the press about its potential
to disrupt social and economic systems. Its astonishing language ability has aroused strong …

Cited by 106

[PDF] arxiv.org

Factscore: Fine-grained atomic evaluation of factual precision in long form text generation

S Min, K Krishna, X Lyu, M Lewis, W Yih… - arxiv preprint arxiv …, 2023 - arxiv.org

Evaluating the factuality of long-form text generated by large language models (LMs) is non-
trivial because (1) generations often contain a mixture of supported and unsupported pieces …

Cited by 491

[PDF] arxiv.org

Is chatgpt a good nlg evaluator? a preliminary study

J Wang, Y Liang, F Meng, Z Sun, H Shi, Z Li… - arxiv preprint arxiv …, 2023 - arxiv.org

Recently, the emergence of ChatGPT has attracted wide attention from the computational
linguistics community. Many prior studies have shown that ChatGPT achieves remarkable …

Cited by 354

[PDF] neurips.cc

Bartscore: Evaluating generated text as text generation

W Yuan, G Neubig, P Liu - Advances in neural information …, 2021 - proceedings.neurips.cc

A wide variety of NLP applications, such as machine translation, summarization, and dialog,
involve text generation. One major challenge for these applications is how to evaluate …

Cited by 799

[PDF] arxiv.org

Generative judge for evaluating alignment

J Li, S Sun, W Yuan, RZ Fan, H Zhao, P Liu - arxiv preprint arxiv …, 2023 - arxiv.org

The rapid development of Large Language Models (LLMs) has substantially expanded the
range of tasks they can address. In the field of Natural Language Processing (NLP) …

Cited by 92

[PDF] arxiv.org

An empirical survey on long document summarization: Datasets, models, and metrics

HY Koh, J Ju, M Liu, S Pan - ACM computing surveys, 2022 - dl.acm.org

Long documents such as academic articles and business reports have been the standard
format to detail out important issues and complicated subjects that require extra attention. An …

Cited by 128

[PDF] mit.edu

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu

Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Cited by 113

[PDF] arxiv.org

Human-like summarization evaluation with chatgpt

M Gao, J Ruan, R Sun, X Yin, S Yang… - arxiv preprint arxiv …, 2023 - arxiv.org

Evaluating text summarization is a challenging problem, and existing evaluation metrics are
far from satisfactory. In this study, we explored ChatGPT's ability to perform human-like …

Cited by 129

[PDF] arxiv.org

QAFactEval: Improved QA-based factual consistency evaluation for summarization

AR Fabbri, CS Wu, W Liu, C Xiong - arxiv preprint arxiv:2112.08542, 2021 - arxiv.org

Factual consistency is an essential quality of text summarization models in practical settings.
Existing work in evaluating this dimension can be broadly categorized into two lines of …

Cited by 190

