Google Scholar

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM transactions on …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Cited by 2276 (all 8 versions)

[PDF] arxiv.org

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Cited by 493 (all 4 versions, HTML version available)

[PDF] arxiv.org

ChatEval: Towards better LLM-based evaluators through multi-agent debate

CM Chan, W Chen, Y Su, J Yu, W Xue, S Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

Text evaluation has historically posed significant challenges, often demanding substantial
labor and time cost. With the emergence of large language models (LLMs), researchers …

Cited by 369 (all 4 versions, HTML version available)

[PDF] arxiv.org

Large language models are not fair evaluators

P Wang, L Li, L Chen, Z Cai, D Zhu, B Lin… - arXiv preprint arXiv …, 2023 - arxiv.org

In this paper, we uncover a systematic bias in the evaluation paradigm of adopting large
language models (LLMs), e.g., GPT-4, as a referee to score and compare the quality of …

Cited by 400 (all 5 versions, HTML version available)

[PDF] arxiv.org

Unleashing the potential of prompt engineering in large language models: a comprehensive review

B Chen, Z Zhang, N Langrené, S Zhu - arXiv preprint arXiv:2310.14735, 2023 - arxiv.org

This comprehensive review delves into the pivotal role of prompt engineering in unleashing
the capabilities of Large Language Models (LLMs). The development of Artificial Intelligence …

Cited by 250 (all 3 versions, HTML version available)

[PDF] arxiv.org

Aligning large language models with human: A survey

Y Wang, W Zhong, L Li, F Mi, X Zeng, W Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …

Cited by 304 (all 3 versions, HTML version available)

[HTML] sciencedirect.com

[HTML] Fine-tuning ChatGPT for automatic scoring

E Latif, X Zhai - Computers and Education: Artificial Intelligence, 2024 - Elsevier

This study highlights the potential of fine-tuned ChatGPT (GPT-3.5) for automatically scoring
student written constructed responses using example assessment tasks in science …

Cited by 115 (all 4 versions)

[PDF] arxiv.org

Large language model alignment: A survey

T Shen, R Jin, Y Huang, C Liu, W Dong, Z Guo… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent years have witnessed remarkable progress made in large language models (LLMs).
Such advancements, while garnering significant attention, have concurrently elicited various …

Cited by 164 (all 2 versions, HTML version available)

[PDF] arxiv.org

Evaluating large language models at evaluating instruction following

Z Zeng, J Yu, T Gao, Y Meng, T Goyal… - arXiv preprint arXiv …, 2023 - arxiv.org

As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …

Cited by 135 (all 5 versions, HTML version available)

[PDF] openreview.net

Prometheus: Inducing fine-grained evaluation capability in language models

S Kim, J Shin, Y Cho, J Jang, S Longpre… - The Twelfth …, 2023 - openreview.net

Recently, GPT-4 has become the de facto evaluator for long-form text generated by large
language models (LLMs). However, for practitioners and researchers with large and custom …

Cited by 144 (all 6 versions, HTML version available)

PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization

A survey on evaluation of large language models